Challenge 1¶

1. Basics of Image and Signal Processing (LE1)¶

1.1. Image Properties¶

Day 1¶

Find or acquire 1-3 images related to your selected country, Mexico. The images should be suitable for demonstrating adjustments to the image properties brightness and hue in experiments.

The pictures are from when I was living in Mexico.

Repository: https://github.com/BR4GR/gbsv-challenges

Code written with the help of the Deep Dive repo, ChatGPT, and opencv.org.

In [1]:
import time
start_time = time.time()
import cv2 as cv
import librosa
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sounddevice as sd

from ipywidgets import interact, widgets
from matplotlib.widgets import Slider, Button, TextBox
from scipy.io.wavfile import write
print(start_time)
1730149987.104151
In [2]:
jellyfish = cv.imread('img/jellyfish.jpg')
print(f"{jellyfish.shape=}\n{jellyfish.dtype=}\n{np.max(jellyfish)=}\n{np.min(jellyfish)=}")
jellyfish = cv.cvtColor(jellyfish, cv.COLOR_BGR2RGB)

plt.imshow(jellyfish)
plt.show()

cavediving = cv.imread('img/cavediving.jpg')
print(f"{cavediving.shape=}\n{cavediving.dtype=}\n{np.max(cavediving)=}\n{np.min(cavediving)=}")
cavediving = cv.cvtColor(cavediving, cv.COLOR_BGR2RGB)

plt.imshow(cavediving)
plt.show()

sunrise = cv.imread('img/sunrise.jpg')
print(f"{sunrise.shape=}\n{sunrise.dtype=}\n{np.max(sunrise)=}\n{np.min(sunrise)=}")
sunrise = cv.cvtColor(sunrise, cv.COLOR_BGR2RGB)
plt.imshow(sunrise)
plt.show()
jellyfish.shape=(3648, 5472, 3)
jellyfish.dtype=dtype('uint8')
np.max(jellyfish)=np.uint8(255)
np.min(jellyfish)=np.uint8(0)
No description has been provided for this image
cavediving.shape=(3648, 5472, 3)
cavediving.dtype=dtype('uint8')
np.max(cavediving)=np.uint8(255)
np.min(cavediving)=np.uint8(0)
No description has been provided for this image
sunrise.shape=(4000, 6000, 3)
sunrise.dtype=dtype('uint8')
np.max(sunrise)=np.uint8(255)
np.min(sunrise)=np.uint8(0)
No description has been provided for this image
In [3]:
def adjust_brightness(image, brightness=0):
    """
    Adjust the brightness of an RGB image.
    
    Parameters:
    - image: Input image in RGB format
    - brightness: Amount to adjust brightness (can be positive or negative)
    
    Returns:
    - Brightness-adjusted image
    """
    image = np.int16(image)
    image = image + brightness
    image = np.clip(image, 0, 255)  # Ensure values stay within [0, 255]
    return np.uint8(image)

def adjust_hue(image, hue_shift=0):
    """
    Adjust the hue of an RGB image.
    
    Parameters:
    - image: Input image in RGB format
    - hue_shift: Amount to shift the hue by (in OpenCV hue units, 0-179; one unit = 2 degrees)
    
    Returns:
    - Hue-adjusted image
    """
    hsv_image = cv.cvtColor(image, cv.COLOR_RGB2HSV)
    hue = hsv_image[:, :, 0].astype(int)
    hue = (hue + hue_shift) % 180  # Ensure hue stays within the range [0, 179]
    hsv_image[:, :, 0] = hue.astype(np.uint8)
    return cv.cvtColor(hsv_image, cv.COLOR_HSV2RGB)

def apply_adjustments(image, brightness=0, hue_shift=0):
    """
    Apply brightness and hue adjustments to an RGB image.
    
    Parameters:
    - image: Input image in RGB format
    - brightness: Amount to adjust brightness
    - hue_shift: Amount to shift the hue by

    Returns:
    - Image with brightness and hue adjustments applied
    """
    bright_image = adjust_brightness(image, brightness)
    final_image = adjust_hue(bright_image, hue_shift)
    return final_image

# Function to display the original and adjusted images side by side
def display_images_side_by_side(image, brightness=0, hue_shift=0, ax=None, title="Image"):
    """
    Display the original and adjusted images side by side for comparison.
    
    Parameters:
    - image: The input image in RGB format
    - brightness: Amount to adjust brightness
    - hue_shift: Amount to shift the hue by
    """
    adjusted_image = apply_adjustments(image, brightness, hue_shift)
    
    # Original image
    ax[0].imshow(image)
    ax[0].set_title(f'Original {title}')
    ax[0].axis('off')
    
    # Adjusted image
    ax[1].imshow(adjusted_image)
    ax[1].set_title(f'Adjusted {title} (Brightness: {brightness}, Hue: {hue_shift})')
    ax[1].axis('off')

def display_all_images():
    """
    Display all images with fixed adjustments for brightness and hue.
    """
    # Fixed values for brightness and hue for each image
    brightness_jellyfish = -40
    hue_jellyfish = 40
    
    brightness_cave = -10
    hue_cave = 20
    
    brightness_sunrise = 10
    hue_sunrise = -20

    fig, axes = plt.subplots(3, 2, figsize=(12, 12))

    display_images_side_by_side(jellyfish, brightness_jellyfish, hue_jellyfish, axes[0], "jellyfish")
    display_images_side_by_side(cavediving, brightness_cave, hue_cave, axes[1], "cave diving")
    display_images_side_by_side(sunrise, brightness_sunrise, hue_sunrise, axes[2], "sunrise")
    
    plt.tight_layout()
    plt.savefig("img/adjusted_images2x3.png", dpi=300)
    plt.show()

display_all_images()
No description has been provided for this image

Day 2¶

Define a problem and use case related to your country profile (Steckbrief) for each assigned property that you intend to address (brightness and hue). Then define 1-2 experiments with objectives on how you want to address your defined problem and use case. Multiple experiments/objectives can be analyzed in one single image. Write concisely for each experiment: WHAT problem and use case do you address, HOW do you want to address/solve the problem for this specific use case, and WHY is your experiment helpful for the chosen use case?

Problem:¶

The jellyfish image was taken in an aquarium in Playa del Carmen. The room was extremely dark, even though it was full of changing, colorful lights that illuminated the jellyfish. The image is too bright, and the variety of colors from the light show is not well represented. The dark room and the colorful glowing effect of the jellyfish are lost due to overexposure. This results in the background becoming too visible and reduces the visual focus on the jellyfish, which were meant to glow against a dark backdrop.

Use Case:¶

Simulating the original light show where jellyfish glowed in different colors in a dark room. The objective is to recreate the low-light ambiance and simulate the various colors the jellyfish displayed during the light show using hue adjustments.

Objectives of the Experiments:¶

Brightness Adjustment: Reduce the overall brightness to restore the dark environment, ensuring that the jellyfish glow remains the visual focus. How: Lower the brightness to darken the background while keeping the jellyfish visible, mimicking the dark-room atmosphere. We will test and compare different methods of brightness adjustment, such as gamma correction, region-specific brightness, and CLAHE.

Hue Adjustment: Simulate the changing colors of the light show by adjusting the hue. This allows us to create the effect of different colored lights shining on the jellyfish, even though the image only shows one color. How: Shift the hue to enhance the jellyfish's glow and replicate the dynamic color changes seen during the light show.

Explanation, Why are these adjustments relevant?¶

Brightness: The dark room is an essential part of the jellyfish exhibit, where overexposure distorts the intended visual experience. Reducing brightness restores this intended ambiance.

Hue: The light show was dynamic and colorful, and adjusting the hue allows us to simulate the color transitions that were key to the experience. This makes the image more visually accurate and engaging.

Day 3¶

Conduct part 1 of your experiments with your images and suitable methods. Reason your choice of parameters and methods using visualizations or 1-2 sentences each.

In [4]:
def adjust_gamma(image, gamma=1.0):
    """
    Adjust the gamma correction of an image.

    Parameters:
    - image: Input image in RGB format
    - gamma: Gamma value for correction (default is 1.0)
    
    Returns:
    - Gamma-corrected image
    """
    inv_gamma = 1.0 / gamma
    table = np.array([(i / 255.0) ** inv_gamma * 255 for i in np.arange(0, 256)]).astype("uint8")
    return cv.LUT(image, table)

def apply_region_specific_brightness(image, brightness_value):
    """
    Apply region-specific brightness adjustment to an image.

    Parameters:
    - image: Input image in RGB format
    - brightness_value: Amount to adjust brightness (can be positive or negative)
    
    Returns:
    - Image with region-specific brightness adjustment applied
    """
    mask = image[:, :, 2] < 100  # Select pixels with a low blue channel as a rough proxy for dark regions
    brightened_image = adjust_brightness(image.copy(), brightness_value)
    image[mask] = brightened_image[mask]
    return image

def apply_clahe(image, clip_limit=2.0, tile_grid_size=(8, 8)):
    """
    Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to an image.
    
    Parameters:
    - image: Input image in RGB format
    - clip_limit: Threshold for contrast limiting
    - tile_grid_size: Size of grid for histogram equalization

    Returns:
    - Image with CLAHE applied
    """
    lab_image = cv.cvtColor(image, cv.COLOR_RGB2LAB)
    clahe = cv.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)
    lab_image[:, :, 0] = clahe.apply(lab_image[:, :, 0])
    return cv.cvtColor(lab_image, cv.COLOR_LAB2RGB)


def apply_vignette(image, radius_factor=2, brightness_reduction=0.5):
    """
    Apply a vignette effect to an image.

    Parameters:
    - image: Input image in RGB format
    - radius_factor: Factor to control the size of the vignette
    - brightness_reduction: Amount to reduce brightness at the edges

    Returns:
    - Image with vignette effect applied
    """
    rows, cols = image.shape[:2]
    center_x, center_y = cols // 2, rows // 2
    # Create a Gaussian kernel that will serve as the vignette mask
    # Calculate the maximum possible radius
    max_radius = np.sqrt(center_x ** 2 + center_y ** 2) / radius_factor
    # Create meshgrid for the X and Y coordinates
    x = np.arange(cols)
    y = np.arange(rows)
    X, Y = np.meshgrid(x, y)
    # Calculate the distance from the center of the image
    distance_from_center = np.sqrt((X - center_x) ** 2 + (Y - center_y) ** 2)
    # Normalize the distances to a range between 0 and 1, then reverse (1 at center, 0 at edges)
    vignette_mask = np.clip(1 - (distance_from_center / max_radius), 0, 1)
    # Apply brightness reduction (e.g., 0.5 means reduce brightness by 50% at the edges)
    vignette_mask = vignette_mask ** brightness_reduction  # Adjust the strength of reduction
    # Apply the vignette mask to each channel (RGB)
    vignette_mask_3d = vignette_mask[:, :, np.newaxis]  # Make mask compatible with 3-channel image
    vignette_image = np.uint8(image * vignette_mask_3d)
    return vignette_image


# Experiment: Create three levels of brightness adjustment (85%, 70%, 55%)
brightness_levels = [-15, -30, -45]
gamma_values = [0.85, 0.7, 0.55]
vignette_values = [0.85, 0.7, 0.55]
clahe_clip_limits = [1.0, 2.0, 3.0]

# Create a 3x3 grid for each method and level of adjustment
fig, ax = plt.subplots(5, 3, figsize=(15, 13))

# Row 1: Linear brightness adjustment
for i, brightness in enumerate(brightness_levels):
    adjusted_image = adjust_brightness(jellyfish.copy(), brightness)
    ax[0, i].imshow(adjusted_image)
    ax[0, i].set_title(f'Linear Brightness: {100 + brightness}%')
    ax[0, i].axis('off')

# Row 2: Region-specific brightness adjustment
for i, brightness in enumerate(brightness_levels):
    region_adjusted_image = apply_region_specific_brightness(jellyfish.copy(), brightness)
    ax[1, i].imshow(region_adjusted_image)
    ax[1, i].set_title(f'Region-Specific: {100 + brightness}%')
    ax[1, i].axis('off')
    
# Row 3: Gamma correction
for i, gamma in enumerate(gamma_values):
    gamma_corrected_image = adjust_gamma(jellyfish.copy(), gamma)
    ax[2, i].imshow(gamma_corrected_image)
    ax[2, i].set_title(f'Gamma Correction: {gamma}')
    ax[2, i].axis('off')

# Row 4: CLAHE with different clip limits (strengths)
for i, clip_limit in enumerate(clahe_clip_limits):
    clahe_image = apply_clahe(jellyfish.copy(), clip_limit=clip_limit)
    ax[3, i].imshow(clahe_image)
    ax[3, i].set_title(f'CLAHE: Clip Limit = {clip_limit}')
    ax[3, i].axis('off')

# Row 5: Vignette effect
for i, brightness in enumerate(vignette_values):
    adjusted_image = apply_vignette(jellyfish.copy(), 1, brightness)
    ax[4, i].imshow(adjusted_image)
    ax[4, i].set_title(f'Vignette: {brightness}')
    ax[4, i].axis('off')


plt.tight_layout()
plt.savefig("img/brightnessexperiment.png", dpi=300)
plt.show()
No description has been provided for this image

Reasoning for Methods and Parameters:¶

Gamma Correction was chosen for its ability to apply non-linear adjustments, which better handles darkening shadows while preserving highlights, making it the best method for enhancing the contrast between the glowing jellyfish and the dark background. CLAHE was chosen for localized contrast enhancement, particularly in low-contrast areas (background). Region-specific adjustments were used to target specific areas like the background, maintaining the focus on the jellyfish. The vignette effect was used as a creative approach to enhance the vintage feel and control the outer brightness. Parameters were chosen by trial and error, using large and small steps until a good step size was found for each method.

Observations:¶

The best result was achieved with Gamma Correction (0.55), which effectively darkened the background while preserving the vibrant, glowing jellyfish. The vignette effect added a vintage feel but wasn't the best fit for the goal of simulating the aquarium light show.

In [5]:
dark_adjusted_jellyfish = adjust_gamma(jellyfish.copy(), 0.55)

# Display the original and adjusted images side by side
fig, ax = plt.subplots(1, 2, figsize=(12, 6))
ax[0].imshow(jellyfish)
ax[0].set_title('Original Jellyfish')
ax[0].axis('off')

ax[1].imshow(dark_adjusted_jellyfish)
ax[1].set_title('Adjusted Jellyfish: Gamma = 0.55')
ax[1].axis('off')
cv.imwrite("img/dark_adjusted_jellyfish.png", cv.cvtColor(dark_adjusted_jellyfish, cv.COLOR_RGB2BGR))
plt.tight_layout()
plt.savefig("img/dark_adjusted_jellyfish_comparison.png", dpi=300)
plt.show()
No description has been provided for this image

Day 4¶

Conduct part 2 of your experiments with your images and suitable methods. Reason your choice of parameters and methods using visualizations or 1-2 sentences each.

In [6]:
hue_shifts = [-80, -60, -40, -20, 0, 20, 40, 60, 80]

# Create a 3x3 grid for hue adjustments
fig, ax = plt.subplots(3, 3, figsize=(12, 12))

for i, hue_shift in enumerate(hue_shifts):
    hue_adjusted_image = adjust_hue(dark_adjusted_jellyfish.copy(), hue_shift=hue_shift)
    
    row = i // 3
    col = i % 3
    
    ax[row, col].imshow(hue_adjusted_image)
    ax[row, col].set_title(f'Hue Shift: {hue_shift}°')
    ax[row, col].axis('off')

plt.tight_layout()
plt.savefig("img/hue_experiment.png", dpi=300)
plt.show()
No description has been provided for this image
In [7]:
# Apply hue shifts to different quadrants of the image
def apply_quadrant_hue_shifts(image, hue_shifts):
    """
    Apply different hue shifts to each quadrant of the image.

    Parameters:
    - image: Input image in RGB format
    - hue_shifts: List of hue shifts for each quadrant (in degrees)
    """
    # Define the quadrants
    #jellyfish.shape is (3648, 5472, 3)
    rowsplit = 2000
    columnsplit = 3150

    top_left = image[0:rowsplit, 0:columnsplit]
    top_right = image[0:rowsplit, columnsplit:]
    bottom_right = image[rowsplit:, columnsplit:]
    bottom_left = image[rowsplit:, 0:columnsplit]
    
    # Apply hue shifts to each quadrant
    top_left = adjust_hue(top_left, hue_shifts[0])
    top_right = adjust_hue(top_right, hue_shifts[1])
    bottom_right = adjust_hue(bottom_right, hue_shifts[2])
    bottom_left = adjust_hue(bottom_left, hue_shifts[3])
    
    # Reconstruct the full image with the adjusted quadrants
    top_half = np.hstack((top_left, top_right))
    bottom_half = np.hstack((bottom_left, bottom_right))
    adjusted_image = np.vstack((top_half, bottom_half))
    filename = f"img/quadrant_hue_shifts{hue_shifts[0]}_{hue_shifts[1]}_{hue_shifts[2]}_{hue_shifts[3]}.png"
    cv.imwrite(filename, cv.cvtColor(adjusted_image, cv.COLOR_RGB2BGR))
    print(f"Image saved as {filename}")

apply_quadrant_hue_shifts(dark_adjusted_jellyfish.copy(), [0, 40, 80, 120])
apply_quadrant_hue_shifts(dark_adjusted_jellyfish.copy(), [20, 60, 100, 140])
Image saved as img/quadrant_hue_shifts0_40_80_120.png
Image saved as img/quadrant_hue_shifts20_60_100_140.png

Hue shifts 0, 40, 80 and 120¶

awesome Image

Hue shifts 20, 60, 100 and 140¶

awesome Image

In [8]:

# Function to apply smooth sliding hue shift from 0° to 180° across the image
def apply_smooth_sliding_hue_shift(image):
    """
    Apply a smooth sliding hue shift from 0° to 180° across the image, left to right.
    
    Parameters:
    - image: Input RGB image.
    
    Returns:
    - Image with a smooth sliding hue shift applied from left to right.
    """
    hsv_image = cv.cvtColor(image, cv.COLOR_RGB2HSV)
    cols = image.shape[1]
    
    # Shift each column's hue in proportion to its horizontal position
    # (0 at the left edge up to 90 OpenCV hue units = 180 degrees at the right).
    # Work in int to avoid uint8 overflow: max hue 179 plus shift 90 exceeds 255.
    hue = hsv_image[:, :, 0].astype(int)
    col_shift = (np.arange(cols) / cols * 90).astype(int)
    hsv_image[:, :, 0] = ((hue + col_shift[np.newaxis, :]) % 180).astype(np.uint8)
    # Convert back to RGB
    sliding_hue_image = cv.cvtColor(hsv_image, cv.COLOR_HSV2RGB)
    
    return sliding_hue_image

smooth_sliding_hue_image = apply_smooth_sliding_hue_shift(dark_adjusted_jellyfish.copy())
cv.imwrite("img/smooth_sliding_hue_shift_0_to_90.png", cv.cvtColor(smooth_sliding_hue_image, cv.COLOR_RGB2BGR))
Out[8]:
True

awesome Image

Reasoning for Methods and Parameters:¶

The uniform hue shifts were chosen to clearly visualize how different color adjustments affect the overall image. This method allowed for a straightforward comparison of how each hue shift impacts the brightness and color of the jellyfish.

The quadrant-based hue shift and the sliding shift were chosen mostly to see what they look like, and because the different effects sit closer together without whitespace in between. The parameters were chosen to display the largest variety of shifts, as there was no real target shift; in the show, all colors were present anyway.

Observations:¶

The results resemble the real-life effect better than I expected; it looks almost exactly how I remember it. The GIF looks much worse than expected because the GIF format uses far fewer colors (a 256-color palette).
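The palette limitation can be illustrated with a quick Pillow sketch (not part of the notebook above; a synthetic random frame stands in for the actual jellyfish frames, and any RGB image with more than 256 distinct colors shows the same effect):

```python
import numpy as np
from PIL import Image

# A synthetic RGB frame stands in for a jellyfish frame here.
rng = np.random.default_rng(0)
frame = Image.fromarray(rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8), mode="RGB")

# Count distinct colors before and after GIF-style 256-color quantization.
n_rgb = len(frame.getcolors(maxcolors=1 << 24))
gif_like = frame.quantize(colors=256)  # same palette reduction a GIF encoder performs
n_gif = len(gif_like.getcolors(maxcolors=1 << 24))
print(n_rgb, "->", n_gif)  # the quantized frame has at most 256 colors
```

Dithering, as enabled in `create_gif` below, trades this banding for noise by mixing palette colors spatially.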

In [9]:
# Function to resize image to a smaller resolution using Lanczos interpolation
def resize_image(image, width=1000):
    """
    Resize the image while maintaining the aspect ratio, using Lanczos interpolation for better color quality.
    
    Parameters:
    - image: Input image in RGB format
    - width: The desired width of the output image
    
    Returns:
    - Resized image with improved interpolation
    """
    height = int(image.shape[0] * (width / image.shape[1]))  # Maintain aspect ratio
    # Use Lanczos interpolation for better quality when resizing
    return cv.resize(image, (width, height), interpolation=cv.INTER_LANCZOS4)

# Resize the image using the updated function
resized_image = resize_image(dark_adjusted_jellyfish.copy(), width=1000)


# Create frames for the GIF
def generate_hue_shift_frames(image, output_folder, num_frames=30):
    """
    Generate frames for a smooth hue shift animation.

    Parameters:
    - image: Input image in RGB format
    - output_folder: Folder to save the frames
    - num_frames: Number of frames to generate

    Returns:
    - List of frame paths
    """
    step = 180 // num_frames  # Define the hue step size
    frames = []
    
    for i in range(0, 180, step):
        hue_shifted_image = adjust_hue(image.copy(), i)
        frame_path = f"{output_folder}/frame_{i}.png"
        cv.imwrite(frame_path, cv.cvtColor(hue_shifted_image, cv.COLOR_RGB2BGR))  # Save frame
        frames.append(frame_path)
    
    return frames

# Generate frames for the resized image
frames = generate_hue_shift_frames(resized_image, output_folder='hue_frames', num_frames=60)
In [10]:
from PIL import Image

# Create a GIF from the saved PNG frames with dithering
def create_gif(frames, output_path, duration=100):
    images = [Image.open(frame) for frame in frames]
    
    # Save the GIF with dithering enabled to help smooth color transitions
    images[0].save(
        output_path, 
        save_all=True, 
        append_images=images[1:], 
        duration=duration, 
        loop=0, 
        dither=Image.Dither.FLOYDSTEINBERG,  # Apply dithering
        optimize=False  # Disable further color optimization to avoid additional color loss
    )
    
# Create the GIF with dithering to improve color quality
create_gif(frames, output_path='img/hue_shift_animation_dithered.gif', duration=100)

GIF¶

Hue Shift Animation

Day 5¶

Measure your results using appropriate methods. Reason your choice of methods and parameters using visualizations or 1-2 sentences. Select specific measurement methods that are particularly suitable for your use case. Analyze the histograms of the original images and, if applicable, during your experiments.

In [11]:
from scipy.stats import skew
from skimage.measure import shannon_entropy

def adjust_brightness(image, brightness=0):
    """
    Adjust the brightness of an RGB image.

    Parameters:
    - image: Input image in RGB format
    - brightness: Amount to adjust brightness (can be positive or negative)

    Returns:
    - Brightness-adjusted image
    """
    image = np.int16(image)
    image = image + brightness
    image = np.clip(image, 0, 255)  # Ensure values stay within [0, 255]
    return np.uint8(image)

# Function to apply gamma correction
def adjust_gamma(image, gamma=1.0):
    """
    Adjust the gamma correction of an image.

    Parameters:
    - image: Input image in RGB format
    - gamma: Gamma value for correction (default is 1.0)

    Returns:
    - Gamma-corrected image
    """
    inv_gamma = 1.0 / gamma
    table = np.array([(i / 255.0) ** inv_gamma * 255 for i in np.arange(0, 256)]).astype("uint8")
    return cv.LUT(image, table)

# Function to plot histograms for brightness adjustments for multiple methods
def plot_multiple_brightness_histograms_log(original_image, methods_dict):
    """
    Plots histograms for the original image and multiple brightness adjustment methods with a logarithmic scale.
    
    Parameters:
    - original_image: The original input image.
    - methods_dict: A dictionary where keys are method names and values are the corresponding adjusted images.
    """
    # Convert original to grayscale
    original_gray = cv.cvtColor(original_image, cv.COLOR_RGB2GRAY)
    
    # Number of methods + 1 for the original image
    num_methods = len(methods_dict)
    
    plt.figure(figsize=(15, 6))
    
    # Plot original image histogram (with log scale on y-axis)
    plt.subplot(1, num_methods + 1, 1)
    plt.hist(original_gray.ravel(), bins=256, range=(0, 256), color='gray')
    plt.yscale('log')  # Set y-axis to logarithmic scale
    plt.title('Original Image (Log Scale)')
    plt.xlabel('Pixel Intensity')
    plt.ylabel('Frequency')
    
    # Plot histograms for each method with log scale
    for i, (method_name, adjusted_image) in enumerate(methods_dict.items(), 2):
        adjusted_gray = cv.cvtColor(adjusted_image, cv.COLOR_RGB2GRAY)
        plt.subplot(1, num_methods + 1, i)
        plt.hist(adjusted_gray.ravel(), bins=256, range=(0, 256), color='gray')
        plt.yscale('log')  # Set y-axis to logarithmic scale
        plt.title(f'{method_name} (Log Scale)')
        plt.xlabel('Pixel Intensity')
    
    plt.tight_layout()
    plt.savefig("brightness_comparison_histograms_log.png", dpi=300)
    plt.show()

def calculate_brightness_kpis(image):
    """
    Calculate Key Performance Indicators (KPIs) for brightness of an image.

    Parameters:
    - image: Input image in RGB format
    
    Returns:
    - Dictionary of KPIs (Mean, Std Dev, Skewness, Dynamic Range, Entropy)
    """
    gray_image = cv.cvtColor(image, cv.COLOR_RGB2GRAY)
    
    # Selected KPIs
    mean = np.mean(gray_image)
    std_dev = np.std(gray_image)
    skewness = skew(gray_image.ravel())
    dynamic_range = np.max(gray_image) - np.min(gray_image)
    entropy = shannon_entropy(gray_image)
    
    return {
        "Mean": mean,
        "Std Dev": std_dev,
        "Skewness": skewness,
        "Dynamic Range": dynamic_range,
        "Entropy": entropy
    }

# Function to calculate KPIs for multiple methods
def compare_brightness_kpis(original_image, methods_dict):
    """
    Compare Key Performance Indicators (KPIs) for brightness of an image across multiple methods.
    
    Parameters:
    - original_image: The original input image.
    - methods_dict: A dictionary where keys are method names and values are the corresponding adjusted images.

    Returns:
    - Dictionary of KPIs for the original image and each adjusted image.
    """
    print("KPIs for the Original Image:")
    original_kpis = calculate_brightness_kpis(original_image)
    for kpi_name, value in original_kpis.items():
        print(f"{kpi_name}: {value}")

    for method_name, adjusted_image in methods_dict.items():
        print(f"\nKPIs for {method_name}:")
        adjusted_kpis = calculate_brightness_kpis(adjusted_image)
        for kpi_name, value in adjusted_kpis.items():
            print(f"{kpi_name}: {value}")


# Example images: original and two different brightness-adjusted methods
original_image = jellyfish.copy()

# Apply different adjustments
gamma_fish = adjust_gamma(original_image.copy(), gamma=0.55)  # Gamma correction
cv.imwrite("img/gamma_fish.png", cv.cvtColor(gamma_fish, cv.COLOR_RGB2BGR))
dark_fish = adjust_brightness(original_image.copy(), brightness=-60)  # Linear brightness adjustment
cv.imwrite("img/dark_fish.png", cv.cvtColor(dark_fish, cv.COLOR_RGB2BGR))



# Step 1: Plot histograms for multiple methods
methods_dict = {
    "Gamma Correction (0.55)": gamma_fish,
    "Linear Brightness (-60)": dark_fish
}

plot_multiple_brightness_histograms_log(original_image, methods_dict)

# Step 2: Compare KPIs for multiple methods
compare_brightness_kpis(original_image, methods_dict)
No description has been provided for this image
KPIs for the Original Image:
Mean: 39.41889767163935
Std Dev: 37.269260229864
Skewness: 1.9444142675661276
Dynamic Range: 254
Entropy: 6.587779365349093

KPIs for Gamma Correction (0.55):
Mean: 15.993016981988047
Std Dev: 27.93977790955635
Skewness: 3.5483630400647086
Dynamic Range: 255
Entropy: 5.137508582224073

KPIs for Linear Brightness (-60):
Mean: 11.016358849597953
Std Dev: 23.555591835182078
Skewness: 3.527971057920704
Dynamic Range: 195
Entropy: 4.078345291811171

Gamma (0.55)¶

awesome Image

Linear Brightness (-60)¶

awesome Image

Why the metrics were chosen: The selected metrics directly address the use case of analyzing brightness adjustments. By using these metrics, we can quantitatively determine how the two adjustment methods (gamma correction and linear brightness adjustment) differ in terms of their effects on the image. Mean and standard deviation were chosen as basic indicators of brightness and contrast. Skewness was chosen to see whether the adjustments cause the image to become more biased towards darker or lighter regions. Dynamic Range was selected to quantify how much detail remains after adjustment. Entropy was chosen to measure how much image information and texture is preserved.

Why specific parameters were chosen: Gamma (0.55) was chosen because it darkens the mid-tones more than the highlights or shadows, which is suitable for the jellyfish image where we want to keep glowing areas intact while darkening the background. Linear Brightness (-60) was chosen to uniformly darken the image, providing a straightforward comparison with the more complex gamma adjustment. Both parameters were found by trial and error.
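The mid-tone claim can be checked directly on the lookup table built in `adjust_gamma`: at gamma 0.55 the absolute drop in pixel value is largest for mid-tones, smaller for shadows and highlights.

```python
import numpy as np

# Rebuild the gamma-0.55 lookup table from adjust_gamma above and compare
# the absolute darkening at a shadow, a mid-tone, and a highlight value.
gamma = 0.55
inv_gamma = 1.0 / gamma
table = np.array([(i / 255.0) ** inv_gamma * 255 for i in np.arange(0, 256)]).astype("uint8")

drops = {i: i - int(table[i]) for i in (32, 128, 224)}  # shadow, mid-tone, highlight
print(drops)  # the mid-tone (128) loses the most absolute brightness
```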

Day 6¶

Select the best results from your experiments. Summarize observations and interpretations about your results and discuss your findings and results in relation to your problem and use case in approximately 150 words.

Both gamma correction (0.55) and linear brightness adjustment (-60) were applied to simulate a dark environment with glowing jellyfish, similar to the original scene. Upon visual inspection, the gamma correction preserved the overall mid-tone contrast but exaggerated some of the bright spots, especially around the center-left of the image. This caused parts of the image to become overexposed and lose subtle detail. In contrast, the linear brightness adjustment resulted in a more balanced effect where the jellyfish glowed softly, and the bright spots remained controlled, making it a better choice for retaining natural highlights.

From a quantitative standpoint, gamma correction resulted in a higher skewness and slightly higher entropy than linear adjustment. However, entropy alone is not the best indicator of quality in this case, as the linear adjustment kept the bright spots intact and avoided oversaturation, preserving the aesthetic and intended visual effect of the jellyfish glowing in a dark room.

Therefore, linear brightness adjustment is the most suitable method for this specific use case of preserving glowing objects in a dark environment.

In [12]:
time_now = time.time()
passed = time_now - start_time
print(f"Time taken: {passed:.2f} seconds")
Time taken: 212.58 seconds

1.2. Signal Properties¶

Day 7¶

Find or acquire 1-3 signals related to your selected country. The signals should be suitable for demonstrating adjustments to signal properties frequency and smoothing in experiments.

In [13]:
audio, sampling_rate = librosa.load("sound/AztecWhistle.wav")
print(f"{audio=}")
print(f"{audio.dtype=}")
print(f"{audio.shape=}")
print(f"{np.max(audio)=}")
print(f"{np.min(audio)=}")

from IPython.display import Markdown, Audio

display(Markdown(f"**AztecWhistle**"))
display(Audio(audio, rate=sampling_rate))
audio=array([0.00283813, 0.00253296, 0.00274658, ..., 0.        , 0.        ,
       0.        ], dtype=float32)
audio.dtype=dtype('float32')
audio.shape=(121174,)
np.max(audio)=np.float32(0.9999695)
np.min(audio)=np.float32(-0.9477539)

AztecWhistle

Your browser does not support the audio element.
In [14]:
times = np.arange(0, len(audio)) / sampling_rate
plt.figure(figsize=(12, 4))
plt.plot(times, audio)
plt.title("Original Aztec Death Whistle Signal")
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")
plt.show()
No description has been provided for this image

The Aztec death whistle is an ancient ceremonial instrument used by the Aztecs, known for producing an intense, eerie sound similar to a human scream. It was often used in rituals and psychological warfare to create fear and panic.

Day 8¶

Define a problem and use case related to your country profile (Steckbrief) for each assigned property (frequency and smoothing) that you intend to address. Then define 1-2 experiments with objectives on how you want to address your defined problem and use case. Multiple experiments/objectives can be analyzed in one single signal. Write concisely for each experiment: WHAT problem and use case do you address, HOW do you want to address/solve the problem for this specific use case, and WHY is your experiment helpful for the chosen use case?

Problem and use case

  • Understanding why certain sounds (like the Aztec death whistle) cause emotional and physical reactions can provide insights into how sound has been used in warfare or rituals. Analyzing the whistle’s signal can help us understand the acoustic properties that trigger these reactions, and experimenting with modifications can demonstrate how small changes can reduce or enhance these unsettling characteristics.

Experiment 1: Smoothing the Signal Using Low-Pass Filtering

  • Objective: To smooth the chaotic, sharp transitions in the sound to make the signal less jarring, while observing how smoothing affects the emotional impact of the sound.

  • Method: We will apply low-pass filters with different cutoff frequencies (e.g., 1kHz, 2kHz) to remove the higher, more chaotic frequencies that contribute to the whistle's harshness. This will result in a smoother, softer version of the sound. After applying the filters, we will compare the original and smoothed signals both visually (waveform analysis) and aurally (listening).

  • Why This is Helpful: Smoothing reduces the sharp, abrupt changes in the sound, potentially making the sound less aggressive and more palatable. This experiment will help demonstrate how removing high-frequency content impacts the emotional response to the whistle, making it useful for educational or research purposes where the sound's intensity needs to be reduced.

Experiment 2: Dynamic Frequency Shifting (Frequency Modulation)

  • Objective: To experiment with frequency shifting (modulation) techniques to shift the whistle's frequencies up or down the spectrum and analyze how these shifts alter the sound’s emotional effect.

  • Method: We will apply frequency modulation by shifting the whistle’s base frequencies up and down by 500 Hz, 1000 Hz, and 2000 Hz. This will allow us to experiment with how the pitch-shifting of frequencies changes the emotional impact of the sound.

  • Why This is Helpful: By shifting the whistle’s frequency spectrum, we can explore how changes in pitch impact the listener’s perception. The goal is to see whether lowering or raising the frequencies diminishes or enhances the unsettling quality of the sound.

Experiment 3: Temporal Smoothing with Savitzky-Golay Filter

  • Objective: To smooth the signal over time, reducing the sharp changes in amplitude while maintaining the core characteristics of the whistle’s tone.

  • Method: We will apply a Savitzky-Golay filter, a popular smoothing algorithm, to smooth the time-domain signal. This will help reduce the abrupt transitions in the sound without affecting its frequency components. We will compare the amplitude envelope before and after smoothing to measure the impact on the sound.

  • Why This is Helpful: Smoothing the time-domain signal will reduce the jarring effect of sudden loud noises while retaining the whistle's tonal characteristics. This is especially useful for environments where the sound’s intensity needs to be softened while maintaining its core identity.

Day 9¶

Conduct part 1 of your experiments with your signals and suitable methods. Reason your choice of parameters and methods using visualizations or 1-2 sentences each.

In [15]:
import librosa.effects

# Function to shift the frequencies
def shift_frequencies(audio, sampling_rate, shift_in_hz, direction="up"):
    """
    Shift the frequencies of an audio signal by a specified amount.

    Parameters:
    - audio: Input audio signal
    - sampling_rate: Sampling rate of the audio signal
    - shift_in_hz: Amount to shift the frequencies by (in Hz)
    - direction: Direction of the shift ("up" or "down")

    Returns:
    - Audio signal with shifted frequencies
    """
    # Convert the Hz shift (referenced to the Nyquist frequency) into semitones
    nyquist = sampling_rate / 2
    shift_in_steps = 12 * np.log2((shift_in_hz + nyquist) / nyquist)
    if direction == "down":
        shift_in_steps = -shift_in_steps
    elif direction != "up":
        raise ValueError("direction must be 'up' or 'down'")
    
    # Apply the pitch shift
    shifted_audio = librosa.effects.pitch_shift(audio, sr=sampling_rate, n_steps=shift_in_steps)
    return shifted_audio

shifted_1000_up = shift_frequencies(audio, sampling_rate, 1000)
shifted_2000_up = shift_frequencies(audio, sampling_rate, 2000)
shifted_3000_up = shift_frequencies(audio, sampling_rate, 3000)
shifted_1000_down = shift_frequencies(audio, sampling_rate, 1000, direction="down")
shifted_2000_down = shift_frequencies(audio, sampling_rate, 2000, direction="down")
shifted_3000_down = shift_frequencies(audio, sampling_rate, 3000, direction="down")

write('sound/shifted_1000_up.wav', sampling_rate, shifted_1000_up)
write('sound/shifted_2000_up.wav', sampling_rate, shifted_2000_up)
write('sound/shifted_3000_up.wav', sampling_rate, shifted_3000_up)
write('sound/shifted_1000_down.wav', sampling_rate, shifted_1000_down)
write('sound/shifted_2000_down.wav', sampling_rate, shifted_2000_down)
write('sound/shifted_3000_down.wav', sampling_rate, shifted_3000_down)

For the frequency shifting experiment, pitch shifting via librosa.effects.pitch_shift was chosen because altering the base frequencies of a signal directly impacts how listeners perceive pitch. Note that pitch shifting scales all frequencies multiplicatively, so the stated Hz values are approximations referenced to the Nyquist frequency rather than a uniform linear shift. This lets us explore whether increasing or decreasing pitch intensifies or reduces the emotional impact, providing insight into the acoustic properties that make the sound disturbing.

The parameters for the frequency shifts were chosen by trial and error.
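To get a feel for what these Hz values mean perceptually, the conversion formula used in `shift_frequencies` can be evaluated directly. This is a small sketch assuming the 22050 Hz sampling rate reported for the recording: a +1000 Hz shift referenced to the Nyquist frequency corresponds to roughly 1.5 semitones.

```python
import numpy as np

# Assumed sampling rate of the recording (matches the notebook output above)
sampling_rate = 22050
nyquist = sampling_rate / 2

# Semitone equivalent of a +1000 Hz shift referenced to the Nyquist frequency,
# mirroring the formula used in shift_frequencies above
steps = 12 * np.log2((1000 + nyquist) / nyquist)
print(f"{steps:.2f} semitones")  # ~1.50
```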

In [16]:
from IPython.display import Markdown, Audio

# Function to plot a single signal and provide audio playback
def plot_signal_with_audio(audio_signal, title, sampling_rate, audio_label):
    plt.figure(figsize=(10, 4))
    plt.plot(times, audio_signal)
    plt.title(title)
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.tight_layout()
    plt.show()

    # Display the audio player with a label
    display(Markdown(f"**{audio_label}**"))
    display(Audio(audio_signal, rate=sampling_rate))

# Plot the original signal
plot_signal_with_audio(audio, title="Original Signal", sampling_rate=sampling_rate, audio_label="Original Audio Playback")

# Plot +1000 Hz and -1000 Hz shifted signals
plot_signal_with_audio(shifted_1000_up, title="+1000 Hz Shifted Signal", sampling_rate=sampling_rate, audio_label="+1000 Hz Shifted Audio Playback")
plot_signal_with_audio(shifted_1000_down, title="-1000 Hz Shifted Signal", sampling_rate=sampling_rate, audio_label="-1000 Hz Shifted Audio Playback")

# Plot +2000 Hz and -2000 Hz shifted signals
plot_signal_with_audio(shifted_2000_up, title="+2000 Hz Shifted Signal", sampling_rate=sampling_rate, audio_label="+2000 Hz Shifted Audio Playback")
plot_signal_with_audio(shifted_2000_down, title="-2000 Hz Shifted Signal", sampling_rate=sampling_rate, audio_label="-2000 Hz Shifted Audio Playback")

# Plot +3000 Hz and -3000 Hz shifted signals
plot_signal_with_audio(shifted_3000_up, title="+3000 Hz Shifted Signal", sampling_rate=sampling_rate, audio_label="+3000 Hz Shifted Audio Playback")
plot_signal_with_audio(shifted_3000_down, title="-3000 Hz Shifted Signal", sampling_rate=sampling_rate, audio_label="-3000 Hz Shifted Audio Playback")
No description has been provided for this image

Original Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

+1000 Hz Shifted Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

-1000 Hz Shifted Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

+2000 Hz Shifted Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

-2000 Hz Shifted Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

+3000 Hz Shifted Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

-3000 Hz Shifted Audio Playback

Your browser does not support the audio element.

Day 10¶

Conduct part 2 of your experiments with your signals and suitable methods. Reason your choice of parameters and methods using visualizations or 1-2 sentences each.

In [17]:
from IPython.display import Markdown, Audio
from scipy.signal import butter, filtfilt

# Function to apply low-pass filter
def apply_low_pass_filter(signal, sampling_rate, cutoff_hz):
    # Design a Butterworth low-pass filter
    nyquist = 0.5 * sampling_rate
    normal_cutoff = cutoff_hz / nyquist
    b, a = butter(N=5, Wn=normal_cutoff, btype='low', analog=False)
    filtered_signal = filtfilt(b, a, signal)
    return filtered_signal

# Apply low-pass filters with different cutoff frequencies
low_pass_1000 = apply_low_pass_filter(audio, sampling_rate, 1000)
low_pass_2000 = apply_low_pass_filter(audio, sampling_rate, 2000)

# Plot the original and low-pass filtered signals with audio playback
def plot_filtered_signals_with_audio():
    plt.figure(figsize=(24, 12))

    # Original signal plot
    plt.subplot(3, 1, 1)
    plt.plot(times, audio)
    plt.title("Original Signal")
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")

    # Low-pass filtered (1000 Hz) signal plot
    plt.subplot(3, 1, 2)
    plt.plot(times, low_pass_1000)
    plt.title("Low-Pass Filtered Signal (1000 Hz)")
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")

    # Low-pass filtered (2000 Hz) signal plot
    plt.subplot(3, 1, 3)
    plt.plot(times, low_pass_2000)
    plt.title("Low-Pass Filtered Signal (2000 Hz)")
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")

    plt.tight_layout()
    plt.show()

    # Display audio players with labels
    display(Markdown("**Original Audio Playback**"))
    display(Audio(audio, rate=sampling_rate))

    display(Markdown("**Low-Pass 1000 Hz Filtered Audio Playback**"))
    display(Audio(low_pass_1000, rate=sampling_rate))

    display(Markdown("**Low-Pass 2000 Hz Filtered Audio Playback**"))
    display(Audio(low_pass_2000, rate=sampling_rate))

# Call the function to plot and display audio
plot_filtered_signals_with_audio()

# Save the filtered audio to files
from scipy.io.wavfile import write
write('sound/low_pass_1000.wav', sampling_rate, low_pass_1000.astype(np.float32))
write('sound/low_pass_2000.wav', sampling_rate, low_pass_2000.astype(np.float32))
No description has been provided for this image

Original Audio Playback

Your browser does not support the audio element.

Low-Pass 1000 Hz Filtered Audio Playback

Your browser does not support the audio element.

Low-Pass 2000 Hz Filtered Audio Playback

Your browser does not support the audio element.
In [18]:
from scipy.signal import savgol_filter

# Function to apply a Savitzky-Golay smoothing filter
def apply_savitzky_golay_filter(signal, window_length=101, polyorder=2):
    return savgol_filter(signal, window_length=window_length, polyorder=polyorder)

# Apply Savitzky-Golay filter
savitzky_golay_smoothed = apply_savitzky_golay_filter(audio)

# Plot the original and Savitzky-Golay smoothed signals
plt.figure(figsize=(12, 6))
plt.subplot(2, 1, 1)
plt.plot(times, audio)
plt.title("Original Signal")
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")

plt.subplot(2, 1, 2)
plt.plot(times, savitzky_golay_smoothed)
plt.title("Savitzky-Golay Smoothed Signal")
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")

plt.tight_layout()
plt.show()

write('sound/savitzky_golay_smoothed.wav', sampling_rate, savitzky_golay_smoothed.astype(np.float32))
No description has been provided for this image
In [19]:
from IPython.display import Markdown, Audio
import librosa.display

# Function to plot spectrogram with audio and label
def plot_spectrogram_with_audio(audio, sampling_rate, title="Spectrogram", audio_label="Audio Playback"):
    plt.figure(figsize=(10, 4))
    S = librosa.feature.melspectrogram(y=audio, sr=sampling_rate)
    S_dB = librosa.power_to_db(S, ref=np.max)
    librosa.display.specshow(S_dB, sr=sampling_rate, x_axis='time', y_axis='mel', cmap='magma')
    plt.colorbar(format='%+2.0f dB')
    plt.title(title)
    plt.tight_layout()
    
    # Display the audio label and audio player
    display(Markdown(f"**{audio_label}**"))
    display(Audio(audio, rate=sampling_rate))
    
    plt.show()

# Example: Plot original and filtered spectrograms with audio playback
plot_spectrogram_with_audio(audio, sampling_rate, title="Original Signal Spectrogram", audio_label="Original Audio Playback")
plot_spectrogram_with_audio(low_pass_1000, sampling_rate, title="Low-Pass Filtered (1kHz) Spectrogram", audio_label="Low-Pass 1000 Hz Audio Playback")
plot_spectrogram_with_audio(low_pass_2000, sampling_rate, title="Low-Pass Filtered (2kHz) Spectrogram", audio_label="Low-Pass 2000 Hz Audio Playback")
plot_spectrogram_with_audio(savitzky_golay_smoothed, sampling_rate, title="Savitzky-Golay Smoothed Spectrogram", audio_label="Savitzky-Golay Smoothed Audio Playback")

Original Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

Low-Pass 1000 Hz Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

Low-Pass 2000 Hz Audio Playback

Your browser does not support the audio element.
No description has been provided for this image

Savitzky-Golay Smoothed Audio Playback

Your browser does not support the audio element.
No description has been provided for this image
In [20]:
from IPython.display import display, Audio, Markdown
from scipy.signal import hilbert

# Function to plot envelope with audio and label
def plot_envelope_with_audio(audio, sampling_rate, title="Signal Envelope", audio_label="Audio Playback"):
    # Apply Hilbert transform to get the envelope
    analytic_signal = hilbert(audio)
    envelope = np.abs(analytic_signal)
    time_ = np.arange(len(envelope)) / sampling_rate

    # Plot the envelope
    plt.figure(figsize=(10, 4))
    plt.plot(time_, envelope)
    plt.title(title)
    plt.xlabel('Time (s)')
    plt.ylabel('Envelope Amplitude')
    plt.tight_layout()
    plt.show()

    # Display a label for the audio
    display(Markdown(f"**{audio_label}**"))
    # Embed the audio playback directly below the envelope plot
    display(Audio(audio, rate=sampling_rate))

# Example: Plot envelopes and add labeled audio playback for original and filtered signals
plot_envelope_with_audio(audio, sampling_rate, title="Original Signal Envelope", audio_label="Original Audio")
plot_envelope_with_audio(low_pass_1000, sampling_rate, title="Low-Pass Filtered (1kHz) Envelope", audio_label="Low-Pass 1000 Hz Audio")
plot_envelope_with_audio(savitzky_golay_smoothed, sampling_rate, title="Savitzky-Golay Smoothed Signal Envelope", audio_label="Savitzky-Golay Smoothed Audio")
No description has been provided for this image

Original Audio

Your browser does not support the audio element.
No description has been provided for this image

Low-Pass 1000 Hz Audio

Your browser does not support the audio element.
No description has been provided for this image

Savitzky-Golay Smoothed Audio

Your browser does not support the audio element.

Day 11¶

Measure your results using appropriate methods. Reason your choice of methods and parameters using visualizations or 1-2 sentences. Select specific measurement methods that are particularly suitable for your use case.

In [21]:
from scipy.stats import skew
from skimage.measure import shannon_entropy

# Function to calculate KPIs and convert to standard Python floats
def calculate_kpis(signal, sampling_rate):
    # Convert signal to mono if necessary
    signal_mono = librosa.to_mono(signal)
    
    # Calculate standard KPIs
    mean = float(np.mean(signal_mono))
    std_dev = float(np.std(signal_mono))
    skewness = float(skew(signal_mono))
    dynamic_range = float(np.max(signal_mono) - np.min(signal_mono))
    entropy = float(shannon_entropy(signal_mono))
    
    # Spectral features
    spectral_centroid = float(np.mean(librosa.feature.spectral_centroid(y=signal_mono, sr=sampling_rate)))
    spectral_bandwidth = float(np.mean(librosa.feature.spectral_bandwidth(y=signal_mono, sr=sampling_rate)))
    zero_crossing_rate = float(np.mean(librosa.feature.zero_crossing_rate(signal_mono)))
    
    return {
        "Mean": mean,
        "Standard Deviation": std_dev,
        "Skewness": skewness,
        "Dynamic Range": dynamic_range,
        "Entropy": entropy,
        "Spectral Centroid": spectral_centroid,
        "Spectral Bandwidth": spectral_bandwidth,
        "Zero Crossing Rate": zero_crossing_rate
    }

# Example: Calculate KPIs for original and processed signals
kpis_original = calculate_kpis(audio, sampling_rate)
kpis_shifted_3000 = calculate_kpis(shifted_3000_up, sampling_rate)
kpis_low_pass_1000 = calculate_kpis(low_pass_1000, sampling_rate)


# Prepare the KPI data
kpis_data = {
    "Original Signal": kpis_original,
    "+3000 Hz Shifted": kpis_shifted_3000,
    "Low-Pass 1000 Hz": kpis_low_pass_1000
}

# Convert the KPI data into a DataFrame for a cleaner table display
kpis_df = pd.DataFrame(kpis_data)

# Display the KPIs in table format
print(kpis_df)
                    Original Signal  +3000 Hz Shifted  Low-Pass 1000 Hz
Mean                      -0.000004          0.000003         -0.000004
Standard Deviation         0.109932          0.074204          0.014490
Skewness                   0.039047          0.038837         -0.079182
Dynamic Range              1.947723          1.586893          0.711466
Entropy                   13.321276         16.885730         16.886721
Spectral Centroid       2256.571799       2672.534128        944.300994
Spectral Bandwidth      1413.563726       1437.703945        260.682652
Zero Crossing Rate         0.161120          0.195181          0.085631
In [22]:
import matplotlib.pyplot as plt
import numpy as np

# Function to create bar charts for comparing KPIs
def plot_kpi_comparison(kpis_dict, title):
    """
    kpis_dict: Dictionary of KPIs with keys as signal names and values as KPI dictionaries.
    """
    metrics = list(next(iter(kpis_dict.values())).keys())  # Get list of KPI names
    
    # Create a figure for the bar charts
    num_metrics = len(metrics)
    fig, axs = plt.subplots(1, num_metrics, figsize=(20, 5))
    
    signals = list(kpis_dict.keys())  # List of signals (Original, Shifted, Filtered)
    
    for i, metric in enumerate(metrics):
        values = [kpis_dict[signal][metric] for signal in kpis_dict]
        axs[i].bar(signals, values)
        axs[i].set_title(metric)
        axs[i].set_ylabel('Value')
        axs[i].set_xticks(range(len(signals)))  # Set the tick positions
        axs[i].set_xticklabels(signals, rotation=45)  # Set tick labels after setting the positions
    
    fig.suptitle(title)
    plt.tight_layout()
    plt.show()

# Example: Prepare KPI data for visualization
kpis_data = {
    "Original Signal": kpis_original,
    "+3000 Hz Shifted": kpis_shifted_3000,
    "Low-Pass 1000 Hz": kpis_low_pass_1000
}

# Plot KPI comparisons
plot_kpi_comparison(kpis_data, title="KPI Comparison for Original and Processed Signals")
No description has been provided for this image

The methods used to measure the results aim to capture the impact of signal processing techniques such as frequency shifting and low-pass filtering. I applied a variety of key performance indicators (KPIs) to analyze the signals both before and after processing. Specifically, I chose:

Mean and Standard Deviation:

  • These metrics provide insights into the amplitude distribution and variation of the signal, helping us understand how dynamic or smooth the sound is.

Skewness:

  • This metric is useful for detecting any asymmetry in the signal's distribution, revealing biases introduced by shifting the frequencies.

Dynamic Range:

  • This shows how much variation exists between the loudest and softest parts of the signal, which helps in understanding how much "depth" the sound has retained or lost.

Entropy:

  • This is a measure of complexity in the signal, capturing the amount of information in both the original and processed signals.

Spectral Centroid and Spectral Bandwidth:

  • These frequency-based metrics measure how the energy distribution and spread of frequencies change after processing. They are directly tied to how the perceived tone and timbre of the signal change after frequency shifting and filtering.

Zero Crossing Rate:

  • This measures the number of times the signal crosses zero, which helps quantify the noisiness or smoothness of the signal. It is especially useful for examining how low-pass filtering affects high-frequency noise.

The signals "+3000 Hz Shifted" and "Low-Pass 1000 Hz" were chosen for the comparison because they are the most and least extreme modifications.
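To make the zero-crossing idea concrete, here is a tiny synthetic sanity check (a pure tone, not the whistle signal): a 5 Hz sine sampled over one second should cross zero about 2 × 5 = 10 times, and the zero-crossing rate is the fraction of adjacent sample pairs that straddle zero.

```python
import numpy as np

# Synthetic example: pure f Hz tone, one second at sr Hz
f, sr = 5, 1000
t = (np.arange(sr) + 0.5) / sr        # half-sample offset avoids samples landing exactly on zero
x = np.sin(2 * np.pi * f * t)

sign_changes = int(np.sum(np.diff(np.sign(x)) != 0))
zcr = sign_changes / (len(x) - 1)     # fraction of adjacent pairs that cross zero
print(sign_changes, zcr)
```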

Day 12¶

Select the best results from your experiments. Summarize observations and interpretations about your results and discuss your findings and results in relation to your problem and use case in approximately 150 words.

We analyzed the effects of frequency shifts and low-pass filtering on the Aztec whistle to understand their impact on its unsettling nature. The +3000 Hz shift produced the most chaotic, eerie sound, amplifying the disturbing qualities of the original, making it ideal for enhancing its effect in ritualistic or dramatic settings. In contrast, the 1000 Hz low-pass filter reduced the harshness, but the whistle remained disorienting, showing that even significant filtering cannot fully neutralize its jarring characteristics. These results highlight the whistle’s acoustic resilience, as it retains its unsettling emotional impact despite modifications. Spectrograms and key performance indicators (KPIs) revealed changes in spectral centroid, bandwidth, and zero-crossing rates. While the +3000 Hz shift increased the spectral centroid and bandwidth, the 1000 Hz low-pass filter drastically reduced these values. However, despite these visual differences, the whistle’s auditory experience remained chaotic, underscoring its inherently disturbing nature across modifications.

In [23]:
time_now = time.time()
passed = time_now - start_time
print(f"Time taken: {passed:.2f} seconds")
Time taken: 229.18 seconds

1.3. Nyquist Theorem¶

Day 13¶

Work on the Nyquist theorem with one of your signals. Define a problem and use case for this purpose. Focus on the transitions of the minimally required frequency compared to insufficient information and what looks qualitatively fine. Calculate the Nyquist theorem.

Problem:¶
  • The Aztec whistle is known for its eerie and unsettling tone, and the objective is to determine how various sampling rates, particularly those below the Nyquist rate, affect the whistle's sound quality, clarity, and unsettling nature. This is especially important for understanding the trade-offs between sound quality and data storage or transmission efficiency, as lower sampling rates result in reduced file sizes but may distort the original sound.
Use Case:¶
  • The use case involves examining the effects of undersampling on the Aztec whistle signal to determine whether lower sampling rates can effectively preserve its unsettling nature for use in digital audio, such as for theatrical or gaming purposes where file size limitations exist. By understanding the minimum required sampling rate to maintain the emotional impact of the whistle, creators can optimize file storage while still delivering a chilling auditory experience. Additionally, this can be useful for research in sound design, historical reconstructions, or ritualistic contexts where the exact preservation of sound characteristics is critical.
In [24]:
def plot_frequency_spectrum_with_limited_y_range(audio, sampling_rate, y_max=600):
    # Perform FFT
    fft = np.fft.fft(audio)
    magnitude = np.abs(fft)
    frequency = np.linspace(0, sampling_rate / 2, len(magnitude) // 2)
    
    # Only take the first half (positive frequencies)
    magnitude = magnitude[:len(frequency)]
    
    # Plot the full frequency spectrum with limited y-axis
    plt.figure(figsize=(10, 6))
    plt.plot(frequency, magnitude)
    plt.title("Frequency Spectrum of Original Signal (Y-axis limited)")
    plt.xlabel("Frequency (Hz)")
    plt.ylabel("Magnitude")
    plt.ylim(0, y_max)  # Limit y-axis to the desired range
    plt.grid(True)
    plt.show()

# Plot with the default y-axis limit of 600
plot_frequency_spectrum_with_limited_y_range(audio, sampling_rate)
No description has been provided for this image
In [25]:
plot_frequency_spectrum_with_limited_y_range(audio, sampling_rate, 30)
No description has been provided for this image
In [26]:
print(f"Nyquist Frequency: {sampling_rate / 2} Hz")
Nyquist Frequency: 11025.0 Hz

Calculate the Nyquist theorem:

I am not entirely sure whether we are expected to calculate the maximum frequency we can represent at the given sample rate, which would be 11025 Hz, or to find the appropriate sample rate for our audio, which would be 12000 Hz. The latter comes from observing the plots and assuming the frequencies above 6000 Hz are noise. This also corresponds with online research, which indicates that the whistle reaches up to that range. https://www.mexicolore.co.uk/aztecs/music/death-whistle
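Both readings of the question can be written out explicitly. This sketch uses the values discussed above (a 22050 Hz recording with roughly 6000 Hz as the highest meaningful component):

```python
sampling_rate = 22050  # rate of the recording
f_max = 6000           # assumed highest meaningful frequency in the whistle

nyquist_frequency = sampling_rate / 2  # highest frequency representable at this rate
nyquist_rate = 2 * f_max               # minimum sampling rate needed to capture f_max

print(nyquist_frequency)  # 11025.0
print(nyquist_rate)       # 12000
```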

In [27]:
audio_6000_sampling_rate = 12000
audio_6000 = librosa.resample(audio, orig_sr=sampling_rate, target_sr=audio_6000_sampling_rate)
display(Markdown(f"**original with sample rate {sampling_rate}**"))
display(Audio(audio, rate=sampling_rate))

display(Markdown(f"**Sample Rate {audio_6000_sampling_rate}**"))
display(Audio(audio_6000, rate=audio_6000_sampling_rate))

original with sample rate 22050

Your browser does not support the audio element.

Sample Rate 12000

Your browser does not support the audio element.
In [28]:
# Function to plot the frequency spectrum
def plot_frequency_spectrum(audio, sampling_rate, title="Frequency Spectrum"):
    fft = np.fft.fft(audio)
    magnitude = np.abs(fft)
    frequency = np.linspace(0, sampling_rate / 2, len(magnitude) // 2)
    
    plt.figure(figsize=(10, 6))
    plt.plot(frequency, magnitude[:len(frequency)])
    plt.title(title)
    plt.xlabel("Frequency (Hz)")
    plt.ylabel("Magnitude")
    plt.show()
    display(Markdown(f"**Sample Rate: {sampling_rate}**"))
    display(Audio(audio, rate=sampling_rate))

plot_frequency_spectrum(audio_6000, audio_6000_sampling_rate, "Frequency Spectrum of Resampled Signal (6000 Hz)")
No description has been provided for this image

Sample Rate: 12000

Your browser does not support the audio element.

Day 14¶

Demonstrate the Nyquist theorem at suitable points with appropriate methods, visualizations, etc. Focus on the transitions of the minimally required frequency compared to insufficient information and what looks qualitatively fine.

In [29]:
# Resample the audio to 4000 Hz and 3000 Hz
audio_4000 = librosa.resample(audio_6000, orig_sr=audio_6000_sampling_rate, target_sr=4000)
audio_3000 = librosa.resample(audio_6000, orig_sr=audio_6000_sampling_rate, target_sr=3000)

# Plot frequency spectra for the 4000 Hz and 3000 Hz resampled audio
plot_frequency_spectrum(audio_4000, 4000, "Frequency Spectrum at 4000 Hz Sample Rate, resampled")
plot_frequency_spectrum(audio_3000, 3000, "Frequency Spectrum at 3000 Hz Sample Rate, resampled")
No description has been provided for this image

Sample Rate: 4000

Your browser does not support the audio element.
No description has been provided for this image

Sample Rate: 3000

Your browser does not support the audio element.
In [30]:
def simulate_insufficient_data(audio, skip_factor):
    # Skip every nth sample
    skipped_audio = audio[::skip_factor]
    return skipped_audio

# Simulate insufficient data by skipping samples (keeping 1 in every 2 and 1 in every 4 samples)
insufficient_data_audio_2x = simulate_insufficient_data(audio_6000, skip_factor=2)
insufficient_data_audio_4x = simulate_insufficient_data(audio_6000, skip_factor=4)

# Plot the frequency spectrum for the original, 2x skipped, and 4x skipped signals
plot_frequency_spectrum(audio_6000, audio_6000_sampling_rate, title="Frequency Spectrum of Original Signal")
plot_frequency_spectrum(insufficient_data_audio_2x, audio_6000_sampling_rate//2, title="Frequency Spectrum of Signal with 2x Skipped Data")
plot_frequency_spectrum(insufficient_data_audio_4x, audio_6000_sampling_rate//4, title="Frequency Spectrum of Signal with 4x Skipped Data")
No description has been provided for this image

Sample Rate: 12000

Your browser does not support the audio element.
No description has been provided for this image

Sample Rate: 6000

Your browser does not support the audio element.
No description has been provided for this image

Sample Rate: 3000

Your browser does not support the audio element.

We conducted experiments by both resampling the Aztec whistle audio signal and skipping data points to observe the impact of violating the Nyquist theorem. When skipping samples, the sampling rate was effectively reduced without filtering out the high-frequency components, which resulted in aliasing. This caused higher frequency components to "fold" into lower frequencies, significantly increasing the magnitudes in the frequency spectrum. This effect can be observed in the frequency spectrum where the skipped data results show higher and inaccurate magnitudes compared to the resampled data.

On the other hand, resampling the signal according to the Nyquist theorem provided a cleaner result. Frequencies above the Nyquist frequency were filtered out, ensuring that the frequency content of the signal was accurately captured. The comparison highlights how aliasing distorts the signal when insufficient sampling is applied, reaffirming the importance of applying the Nyquist theorem in signal processing to maintain data integrity.
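The folding described above can be computed directly: a component above the Nyquist frequency reappears mirrored at its distance from the nearest integer multiple of the sampling rate. A small sketch (valid for real-valued signals):

```python
def alias_frequency(f, fs):
    """Apparent frequency of an f Hz tone sampled at fs Hz (spectral folding)."""
    return abs(f - fs * round(f / fs))

# An 8000 Hz component naively decimated to a 6000 Hz rate folds down to 2000 Hz:
print(alias_frequency(8000, 6000))  # 2000
# An 11 Hz sine sampled at 15 Hz appears as a 4 Hz tone:
print(alias_frequency(11, 15))      # 4
```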

When we listen to the two versions, we can also hear a clear difference: correct resampling gives a much cleaner result.

The plot below visualizes this behaviour.

In [31]:
import numpy as np
import matplotlib.pyplot as plt

# Define the sine wave function
def sine_wave(t, amplitude, frequency):
    return amplitude * np.sin(2 * np.pi * frequency * t)

# Set the dense time grid and the (under)sampling parameters
sr = 200
t = np.linspace(0, 1, sr)
sample_frequency = 15  # Sampling rate of the red markers, below the Nyquist rate of 2 * 11 = 22 Hz
frequency = 11  # Sine frequency in Hz

# Generate the sine wave and sampled points
sine_w = sine_wave(t, 1, frequency=frequency)

# Create the figure and plot the sine wave
fig, ax = plt.subplots()
line, = ax.plot(t, sine_w, lw=2)
points, = ax.plot(t[::int(sr/sample_frequency)], sine_w[::int(sr/sample_frequency)], 'ro-', lw=1)

# Set plot limits and labels
ax.set_ylim(-1.5, 1.5)
ax.set_xlabel('Time [s]')
ax.set_title("Sample Rate Visualization")

# Show the plot
plt.show()
No description has been provided for this image

Day 15¶

Critically discuss your choice of data, parameters, and methods, as well as the achieved results, in approximately 150 words in relation to your problem statement and use case.

In our experiments, we worked with the Aztec death whistle signal to explore how different sampling rates affect signal fidelity and adherence to the Nyquist theorem. The chosen methods, particularly resampling and skipping data, demonstrated the impact of undersampling on frequency resolution. While the Nyquist theorem was respected during resampling, skipping data led to aliasing, causing high frequencies to incorrectly map into lower frequencies. This underscores the importance of adhering to the Nyquist criterion when preserving original signal characteristics. The Aztec whistle’s unsettling nature was less affected by filtering, confirming its resilience in producing dissonant sounds even under modification. However, when critical data points were skipped, the high-frequency components were lost or misrepresented, altering the emotional intensity of the sound. The methods used were well-suited to the use case, and visualizations, such as frequency spectrums, provided clear insights into how undersampling corrupts signals.

2. Convolution/Filtering in Image and Signal (LE2)¶

2.1. Filtering in the Spatial Domain¶

Day 16¶

Implement a classic algorithm for filtering either signals (1D) or images (2D) in the spatial domain (convolution). What elements should such an algorithm contain?

In [32]:
import numpy as np

def apply_convolution(image, kernel):
    # Get the dimensions of the kernel
    k_height, k_width = kernel.shape

    # Flip the kernel horizontally and vertically: true convolution flips the
    # kernel, otherwise the sliding loop computes cross-correlation
    # (the two coincide for symmetric kernels)
    kernel = np.flipud(np.fliplr(kernel))
    
    # Check if the image is grayscale or colored
    if image.ndim == 2:
        # Grayscale image
        img_height, img_width = image.shape
        num_channels = 1  # Only one channel for grayscale
        image = image[:, :, np.newaxis]  # Add a channel dimension
    else:
        # Colored image
        img_height, img_width, num_channels = image.shape
    
    # Create an output image to store the results
    output_image = np.zeros((img_height, img_width, num_channels))
    
    # Padding the image to handle borders
    pad_height = k_height // 2
    pad_width = k_width // 2
    padded_image = np.pad(
        image,
        ((pad_height, pad_height), (pad_width, pad_width), (0, 0)),
        mode='constant',
        constant_values=0
    )
    
    # Perform convolution for each channel
    for c in range(num_channels):
        for i in range(img_height):
            for j in range(img_width):
                # Extract the region of interest for this channel
                region = padded_image[i:i + k_height, j:j + k_width, c]
                # Perform element-wise multiplication and sum the result
                output_image[i, j, c] = np.sum(region * kernel)
    
    # If the original image was grayscale, remove the extra channel dimension
    if num_channels == 1:
        output_image = output_image[:, :, 0]
    
    # Normalize the output to be in the range of 0 to 255
    output_image = np.clip(output_image, 0, 255)
    
    return output_image.astype(np.uint8)

Here’s a bullet-point list of the key elements a convolution algorithm should contain for processing signals or images in the spatial domain:

  • Input Data:

    • Image or signal (in 1D or 2D form, typically a numpy array).
  • Kernel (Filter):

    • The filter (kernel) matrix that defines the transformation. For image processing, common kernels include blurring (e.g., Gaussian), sharpening, edge detection (e.g., Sobel).
    • Size of the kernel (typically 3x3, 5x5, etc., for images).
  • Sliding Window:

    • For each pixel (or element in 1D), the algorithm slides the kernel across the image/signal, computing a new value for the central element based on the convolution.
  • Padding:

    • Handling the image/signal boundaries by either padding with zeros, mirroring the data, or using a custom padding scheme to ensure that the kernel can be applied at the edges.
  • Element-wise Multiplication:

    • For each pixel (or point in 1D), the convolution involves element-wise multiplication between the kernel and the corresponding neighborhood (sub-region of the image).
  • Summation of Products:

    • After multiplying, the sum of these products is calculated and assigned as the new value of the central pixel (or point) in the output.
  • Normalization (optional):

    • If the kernel values are not normalized, the result can be normalized to avoid extreme pixel intensities (for example, dividing by the sum of the kernel values in a smoothing filter).
  • Boundary Handling:

    • Decide how to handle pixels or points on the edges of the image/signal, whether by extending the image, mirroring, or zero-padding.
  • Output:

    • A filtered image or signal with the convolution applied.
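Strictly speaking, the loop above computes cross-correlation, since the kernel is never flipped; for the symmetric smoothing kernels used here the two coincide. The sliding-window multiply-and-sum core can be checked against SciPy on a tiny array (a standalone sketch, assuming SciPy is available):

```python
import numpy as np
from scipy.ndimage import correlate

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(6, 6)).astype(np.float64)
kernel = np.ones((3, 3)) / 9.0  # symmetric 3x3 mean filter

# Hand-rolled sliding window: zero-pad, extract region, multiply, sum
pad = 1  # kernel_size // 2
padded = np.pad(img, pad, mode='constant', constant_values=0)
out = np.zeros_like(img)
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)

# SciPy's correlate with zero-filled borders gives the same result
ref = correlate(img, kernel, mode='constant', cval=0.0)
print(np.allclose(out, ref))  # True
```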

Day 17¶

Test your function with a signal or an image related to your country and convolution kernel(s) for smoothing of appropriate size. What problem do you want to address/solve for which use case?

awesome Image

The problem we aim to address is the presence of stripes and noise in the bokeh (blurred background) of the jellyfish image. These artifacts distract from the smooth, aesthetically pleasing blur typically associated with bokeh. To improve the visual quality of the image, we will apply smoothing techniques such as Gaussian blur kernels or median filters. These filters should help eliminate the unwanted stripes and noise while mostly preserving the important details of the image, such as the jellyfish itself. This process enhances the overall visual appeal of the subject, making it more suitable for artistic or presentation purposes.

In [33]:
mean_kernel = np.ones((5, 5)) / 25  # 5x5 mean filter

gaussian_kernel = np.array([[1, 4, 7, 4, 1],
                            [4, 16, 26, 16, 4],
                            [7, 26, 41, 26, 7],
                            [4, 16, 26, 16, 4],
                            [1, 4, 7, 4, 1]]) / 273  # Gaussian blur kernel

# Larger Mean blur kernel (9x9)
mean_kernel_large = np.ones((9, 9)) / 81

# Larger Gaussian blur kernel (9x9) - triangular approximation, built manually
gaussian_kernel_large = np.array([[1, 2, 3, 4, 5, 4, 3, 2, 1],
                                  [2, 4, 6, 8, 10, 8, 6, 4, 2],
                                  [3, 6, 9, 12, 15, 12, 9, 6, 3],
                                  [4, 8, 12, 16, 20, 16, 12, 8, 4],
                                  [5, 10, 15, 20, 25, 20, 15, 10, 5],
                                  [4, 8, 12, 16, 20, 16, 12, 8, 4],
                                  [3, 6, 9, 12, 15, 12, 9, 6, 3],
                                  [2, 4, 6, 8, 10, 8, 6, 4, 2],
                                  [1, 2, 3, 4, 5, 4, 3, 2, 1]]) / 625  # entries sum to 625, so divide by 625 to normalize
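The hand-built 9x9 kernel is a triangular approximation of a Gaussian. A properly normalized Gaussian kernel of any size can instead be generated as the outer product of a sampled 1D Gaussian (a minimal sketch; `size` and `sigma` are free parameters, not values used elsewhere in the notebook):

```python
import numpy as np

def gaussian_kernel(size, sigma):
    # Sample a 1D Gaussian centred on the kernel midpoint
    ax = np.arange(size) - (size - 1) / 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    g /= g.sum()  # normalise so the weights sum to 1
    # The 2D Gaussian is separable: outer product of the 1D kernel
    return np.outer(g, g)

k = gaussian_kernel(9, 1.5)
print(k.shape)                   # (9, 9)
print(np.isclose(k.sum(), 1.0))  # True
```

Because the outer product of two unit-sum vectors also sums to 1, no hand-computed normalization constant (like the 273 divisor above) is needed.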
In [34]:
smoth_jelly_s = apply_convolution(image=jellyfish, kernel=mean_kernel)
In [35]:
smoth_jelly = apply_convolution(image=jellyfish, kernel=mean_kernel_large)
gaussian_jelly_s = apply_convolution(image=jellyfish, kernel=gaussian_kernel)
gaussian_jelly = apply_convolution(image=jellyfish, kernel=gaussian_kernel_large)
In [36]:
# Function to zoom into a specific region of the image
def zoom_in(image, center_x, center_y, zoom_factor=0.5):
    height, width, _ = image.shape
    half_height = int((height * zoom_factor) / 2)
    half_width = int((width * zoom_factor) / 2)

    top_left_x = max(0, center_x - half_width)
    top_left_y = max(0, center_y - half_height)
    bottom_right_x = min(width, center_x + half_width)
    bottom_right_y = min(height, center_y + half_height)

    return image[top_left_y:bottom_right_y, top_left_x:bottom_right_x]

# Choose the center of the image to zoom in
center_x, center_y = jellyfish.shape[1] // 2, jellyfish.shape[0] // 2

# Apply zoom
zoomed_original = zoom_in(jellyfish, center_x, center_y, zoom_factor=0.3)
zoomed_mean_blur = zoom_in(smoth_jelly, center_x, center_y, zoom_factor=0.3)
zoomed_mean_blur_s = zoom_in(smoth_jelly_s, center_x, center_y, zoom_factor=0.3)
zoomed_gaussian_blur_s = zoom_in(gaussian_jelly_s, center_x, center_y, zoom_factor=0.3)
zoomed_gaussian_blur = zoom_in(gaussian_jelly, center_x, center_y, zoom_factor=0.3)

# Plot the zoomed-in images
plt.figure(figsize=(16, 10))

plt.subplot(2, 3, 1)
plt.title("Original (Zoomed)")
plt.imshow(zoomed_original)
plt.axis('off')

plt.subplot(2, 3, 2)
plt.title("Mean Blur (Small, Zoomed)")
plt.imshow(zoomed_mean_blur_s)
plt.axis('off')

plt.subplot(2, 3, 3)
plt.title("Gaussian Blur (Small, Zoomed)")
plt.imshow(zoomed_gaussian_blur_s)
plt.axis('off')

plt.subplot(2, 3, 4)
plt.title("Original (Zoomed)")
plt.imshow(zoomed_original)
plt.axis('off')

plt.subplot(2, 3, 5)
plt.title("Mean Blur (Large, Zoomed)")
plt.imshow(zoomed_mean_blur)
plt.axis('off')

plt.subplot(2, 3, 6)
plt.title("Gaussian Blur (Large, Zoomed)")
plt.imshow(zoomed_gaussian_blur)
plt.axis('off')

plt.tight_layout()
plt.show()
No description has been provided for this image

Day 18¶

Test your function with a signal or an image related to your country and convolution kernel(s) for denoising of appropriate size. What problem do you want to address/solve for which use case?

In [37]:
mean_kernel__5x5 = np.ones((5, 5)) / 25.0

mean_kernel__3x3 = np.ones((3, 3)) / 9.0

gaussian_kernel_5x5 = np.array([[1, 4, 6, 4, 1],
                                [4, 16, 24, 16, 4],
                                [6, 24, 36, 24, 6],
                                [4, 16, 24, 16, 4],
                                [1, 4, 6, 4, 1]]) / 256.0

gaussian_kernel_3x3 = np.array([[1, 2, 1],
                                [2, 4, 2],
                                [1, 2, 1]]) / 16.0

gaussian_kernel_9x9 = gaussian_kernel_large

change_detection = np.array([[0, 1, 0],
                             [1, -4, 1],
                             [0, 1, 0]]) * 2

i = apply_convolution(kernel=change_detection, image=cavediving)
change_image = np.abs(i) / np.max(i)
g9x9 = apply_convolution(kernel=gaussian_kernel_9x9, image=cavediving)
g5x5 = apply_convolution(kernel=gaussian_kernel_5x5, image=cavediving)

min_change_factor = 0.4
denoised9x9 = np.where(change_image > min_change_factor, cavediving, g9x9)
denoised5x5 = np.where(change_image > min_change_factor, cavediving, g5x5)
In [38]:
center_x, center_y = 2500, 2200
factor = 0.1
zoomed_x_times = 100 / factor
zoomed_original = zoom_in(cavediving, center_x, center_y, zoom_factor=factor)
zoomed_gaus_9x9 = zoom_in(g9x9, center_x, center_y, zoom_factor=factor)
zoomed_gaus_5x5 = zoom_in(g5x5, center_x, center_y, zoom_factor=factor)
zoomed_denoised9x9 = zoom_in(denoised9x9, center_x, center_y, zoom_factor=factor)
zoomed_denoised5x5 = zoom_in(denoised5x5, center_x, center_y, zoom_factor=factor)
zoomed_change_image = zoom_in(change_image, center_x, center_y, zoom_factor=factor)

less_zoom = 3
zoomed_denoised9x9_less_zoom = zoom_in(denoised9x9, center_x, center_y, zoom_factor=factor*less_zoom)
zoomed_denoised5x5_less_zoom = zoom_in(denoised5x5, center_x, center_y, zoom_factor=factor*less_zoom)

# Plot the zoomed-in images
plt.figure(figsize=(15,20))

plt.subplot(4, 2, 1)
plt.title(f"Original (Zoomed {zoomed_x_times} %)")
plt.imshow(zoomed_original)
plt.axis('off')

plt.subplot(4, 2, 2)
plt.title(f"Change detection Kernel (Zoomed {zoomed_x_times} %)")
plt.imshow(zoomed_change_image)
plt.axis('off')

plt.subplot(4, 2, 3)
plt.title(f"Gaus Kernel (9x9, Zoomed {zoomed_x_times} %)")
plt.imshow(zoomed_gaus_9x9)
plt.axis('off')

plt.subplot(4, 2, 4)
plt.title(f"Gaus Kernel (5x5, Zoomed {zoomed_x_times} %)")
plt.imshow(zoomed_gaus_5x5)
plt.axis('off')

plt.subplot(4, 2, 5)
plt.title(f"Gaus and change detection (9x9, Zoomed {zoomed_x_times} %)")
plt.imshow(zoomed_denoised9x9)
plt.axis('off')

plt.subplot(4, 2, 6)
plt.title(f"Gaus and change detection (5x5, Zoomed {zoomed_x_times} %)")
plt.imshow(zoomed_denoised5x5)
plt.axis('off')

plt.subplot(4, 2, 7)
plt.title(f"Gaus and change detection (9x9, Zoomed {zoomed_x_times//less_zoom} %)")
plt.imshow(zoomed_denoised9x9_less_zoom)
plt.axis('off')

plt.subplot(4, 2, 8)
plt.title(f"Gaus and change detection (5x5, Zoomed {zoomed_x_times//less_zoom} %)")
plt.imshow(zoomed_denoised5x5_less_zoom)
plt.axis('off')
Out[38]:
(np.float64(-0.5), np.float64(1639.5), np.float64(1093.5), np.float64(-0.5))
No description has been provided for this image

For this task, we focused on solving the issue of noise in underwater cave-diving images taken in low-light conditions. The goal was to denoise the image while preserving important details, such as edges and lighting, through the application of convolution filters. The use case involves cleaning up noise from underwater photography, which is inherently challenging due to low light and the presence of particles in the water.

The objective was to apply various denoising filters to the image and compare their effects. We implemented Gaussian filters of different sizes (3x3, 5x5, 9x9), as well as a change detection kernel. By combining these, we ensured that areas of significant change were preserved while smoothing out noise in less important regions. This approach is relevant because it addresses the need to enhance underwater photos for better visibility while maintaining critical details. The selected image was from a cave-diving expedition from when I was doing my scuba instructor training in Mexico.

Several values for min_change_factor, different scalings of the change detection kernel, and various kernel sizes were tried; mean kernels of different sizes were tested as well. Of all the values and kernels I experimented with, the ones in the current notebook led to the best results.
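Besides Gaussian smoothing, a median filter is a classic choice for impulse-like noise because it discards outliers instead of averaging them in. A small self-contained sketch on synthetic salt-and-pepper noise (assuming SciPy is available; this was not run on the cave image):

```python
import numpy as np
from scipy.ndimage import median_filter

rng = np.random.default_rng(1)
img = np.full((32, 32), 128.0)  # flat synthetic "image"

# Corrupt roughly 5% of pixels with salt-and-pepper impulses
noisy = img.copy()
idx = rng.random(img.shape) < 0.05
noisy[idx] = rng.choice([0.0, 255.0], size=idx.sum())

# A 3x3 median removes isolated impulses almost entirely
den = median_filter(noisy, size=3)
print(np.abs(den - img).mean() < np.abs(noisy - img).mean())  # True
```

Unlike a mean or Gaussian kernel, the median does not smear an impulse into its neighbours, which is why it is often preferred for this noise type.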

In [39]:
before = cavediving
after = denoised5x5

Day 19¶

Measure the differences in your data before and after filtering using a quantitative method. Choose specific methods that are particularly suitable for your use case.

In [40]:
from skimage.metrics import mean_squared_error, structural_similarity as ssim
from math import log10
from scipy.stats import entropy

def calculate_kpis(before_image, after_image):
    # Convert images to float for calculation
    before = before_image.astype(np.float32)
    after = after_image.astype(np.float32)

    # 1. Mean Absolute Error (MAE)
    mae = np.mean(np.abs(before - after))

    # 2. Peak Signal-to-Noise Ratio (PSNR)
    mse = mean_squared_error(before, after)
    psnr = 10 * log10(255**2 / mse) if mse != 0 else float('inf')

    # 3. Structural Similarity Index (SSIM) with color images, specify channel_axis
    ssim_value, _ = ssim(before, after, full=True, channel_axis=2, data_range=255)

    # 4. Standard Deviation
    std_before = np.std(before)
    std_after = np.std(after)

    # 5. Entropy
    hist_before, _ = np.histogram(before, bins=256, range=(0, 256), density=True)
    hist_after, _ = np.histogram(after, bins=256, range=(0, 256), density=True)
    entropy_before = entropy(hist_before)
    entropy_after = entropy(hist_after)

    # Print the results
    print(f"Mean Absolute Error (MAE): {mae:.4f}")
    print(f"Peak Signal-to-Noise Ratio (PSNR): {psnr:.4f} dB")
    print(f"Structural Similarity Index (SSIM): {ssim_value:.4f}")
    print(f"Standard Deviation (Before): {std_before:.4f}, Standard Deviation (After): {std_after:.4f}")
    print(f"Entropy (Before): {entropy_before:.4f}, Entropy (After): {entropy_after:.4f}")

    # Return KPIs as a dictionary
    return {
        "MAE": mae,
        "PSNR": psnr,
        "SSIM": ssim_value,
        "Std Before": std_before,
        "Std After": std_after,
        "Entropy Before": entropy_before,
        "Entropy After": entropy_after
}

calculate_kpis(before, after)
Mean Absolute Error (MAE): 0.7061
Peak Signal-to-Noise Ratio (PSNR): 45.5993 dB
Structural Similarity Index (SSIM): 0.9809
Standard Deviation (Before): 38.5319, Standard Deviation (After): 38.4751
Entropy (Before): 3.9105, Entropy (After): 3.9270
Out[40]:
{'MAE': np.float32(0.70606565),
 'PSNR': 45.59928961893503,
 'SSIM': np.float32(0.9809079),
 'Std Before': np.float32(38.531883),
 'Std After': np.float32(38.475067),
 'Entropy Before': np.float64(3.910515699124992),
 'Entropy After': np.float64(3.926954480666715)}

The KPIs, their use cases, and explanations:

  1. Mean Absolute Error (MAE): 0.7061

    • MAE measures the average absolute difference between the pixel values of the two images (before and after filtering). In this case, a low value (0.7061) indicates that the filtered image is very close to the original, meaning the filtering process did not drastically alter the image on average.
  2. Peak Signal-to-Noise Ratio (PSNR): 45.5993 dB

    • PSNR quantifies the ratio between the maximum possible value of a signal (the image) and the noise introduced by the filtering process. Higher PSNR values indicate that the filtered image retains more similarity to the original (less noise). A value above 40 dB is typically considered excellent, meaning the filtering introduced very little noise, and the image quality is largely preserved.
  3. Structural Similarity Index (SSIM): 0.9809

    • SSIM measures the perceptual similarity between the original and filtered images, considering factors like luminance, contrast, and structure. SSIM values range from 0 to 1, where 1 indicates perfect similarity. A value of 0.9809 suggests that the filtered image is almost identical to the original from a structural and perceptual standpoint.
  4. Standard Deviation (Before): 38.5319, Standard Deviation (After): 38.4751

    • Standard deviation measures the contrast and variability of pixel intensities within the image. The slight reduction in standard deviation after filtering (from 38.5319 to 38.4751) suggests that the filter mildly smoothed the image, reducing minor intensity variations (like noise) but not drastically changing the overall contrast.
  5. Entropy (Before): 3.9105, Entropy (After): 3.9270

    • Entropy measures the amount of information or randomness in an image, often linked to texture and detail. The entropy increased slightly after filtering (from 3.9105 to 3.9270), indicating that the filtered image may have slightly more fine details or subtle variations in pixel intensity. This could be due to how the filter enhanced certain features of the image while reducing noise.
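As a sanity check on the PSNR figure above, the metric follows directly from the MSE with the 8-bit peak value of 255 (a toy example with made-up pixel values, not the cave image):

```python
import numpy as np
from math import log10

# Toy example: two nearly identical 8-bit "images"
before = np.array([[100, 102], [98, 101]], dtype=np.float32)
after  = np.array([[101, 102], [98, 100]], dtype=np.float32)

mse = np.mean((before - after) ** 2)  # squared diffs (1,0,0,1) -> 0.5
psnr = 10 * log10(255**2 / mse)       # ~51.1 dB
print(round(psnr, 1))
```

Small per-pixel differences drive the MSE toward zero and the PSNR upward, which is why the 45.6 dB measured above counts as excellent.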

Day 20¶

Discuss your choice of data, methods, and parameters, as well as the achieved results, in approximately 150 words in relation to your problem statement and use case. Why did you choose these convolution kernels and these quantitative methods?

For this task, I used a cave-diving image taken in a dark environment to test the effectiveness of denoising techniques. We applied Gaussian filters (5x5 and 9x9 kernels) combined with a change detection kernel to preserve details while reducing noise. The Gaussian filter was chosen because of its well-established use in image denoising, providing a balance between blurring and retaining essential features. The change detection kernel helped isolate areas with minimal change, allowing for selective noise reduction. We evaluated the results using key performance indicators (KPIs) such as Mean Absolute Error (MAE), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), Standard Deviation, and Entropy. These metrics were specifically chosen to assess image quality, noise reduction, and preservation of structural integrity. The results show that the larger Gaussian kernel (9x9) achieved greater noise reduction, but at the cost of some image detail, while the 5x5 kernel preserved more fine details, providing a better balance for this use case.

I believe this approach was particularly suitable for the cave-diving image, where high noise levels were present due to low lighting conditions. The visualizations demonstrated clear improvements in smoothness, while the KPIs indicated minimal loss in structural similarity and acceptable levels of noise reduction, both critical for enhancing underwater images.

In [41]:
time_now = time.time()
passed = time_now - start_time
print(f"Time taken: {passed:.2f} seconds")
Time taken: 280.25 seconds

2.2. Filtering in the Spectral Domain¶

Day 21¶

Implement a method based on spectral filtering to filter out a repetitive pattern (circles) from either a signal or an image.

In [42]:
def remove_circles(
    image_path,
    dp=1,
    minDist=20,
    param1=50,
    param2=60,
    minRadius=3,
    maxRadius=70,
    inpaintRadius=3,
    inpaintMethod=cv.INPAINT_NS,
    show_circles=False,
    circle_size_increase=0
):
    """
    Detects circles in an image and removes them using inpainting.

    Parameters:
    - image_path (str): Path to the input image.
    - dp (float): Inverse ratio of the accumulator resolution to the image resolution.
    - minDist (float): Minimum distance between the centers of detected circles.
    - param1 (float): Higher threshold for the Canny edge detector.
    - param2 (float): Accumulator threshold for the circle centers at the detection stage.
    - minRadius (int): Minimum circle radius.
    - maxRadius (int): Maximum circle radius.
    - inpaintRadius (int): Radius of a circular neighborhood of each point inpainted.
    - inpaintMethod (flag): Inpainting method (cv.INPAINT_TELEA or cv.INPAINT_NS).
    - show_circles (bool): If True, displays the image with detected circles drawn.

    Returns:
    - img_inpainted (numpy.ndarray): The image with circles removed.
    """
    # Load the image
    img = cv.imread(image_path)
    if img is None:
        print("Error: Image not found or unable to open.")
        return None

    # Convert to grayscale
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    # Apply Hough transform to detect circles
    detected_circles = cv.HoughCircles(
        gray,
        cv.HOUGH_GRADIENT,
        dp=dp,
        minDist=minDist,
        param1=param1,
        param2=param2,
        minRadius=minRadius,
        maxRadius=maxRadius
    )

    # Create a mask to cover the detected circles
    mask = np.zeros_like(gray)

    if detected_circles is not None:
        # Convert the circle parameters a, b, r to integers
        detected_circles = np.uint16(np.around(detected_circles))

        # If show_circles is True, make a copy of the image to draw circles
        if show_circles:
            img_with_circles = img.copy()

        for pt in detected_circles[0, :]:
            a, b, r = pt[0], pt[1], pt[2]

            # Draw filled circles on the mask
            cv.circle(mask, (a, b), r+circle_size_increase, 255, thickness=-1)

            if show_circles:
                # Draw the circumference of the circle
                cv.circle(img_with_circles, (a, b), r, (0, 255, 0), 2)
                # Draw a small circle to show the center
                cv.circle(img_with_circles, (a, b), 1, (0, 0, 255), 3)

        # Inpaint the image using the mask
        img_inpainted = cv.inpaint(img, mask, inpaintRadius=inpaintRadius, flags=inpaintMethod)

        # Display the original image and the image after inpainting
        plt.figure(figsize=(20, 20))

        plt.subplot(1, 2, 1)
        plt.imshow(cv.cvtColor(img, cv.COLOR_BGR2RGB))
        plt.title("Original Image with Detected Circles")
        plt.axis("off")

        plt.subplot(1, 2, 2)
        plt.imshow(cv.cvtColor(img_inpainted, cv.COLOR_BGR2RGB))
        plt.title("Image after Removing Circles")
        plt.axis("off")

        plt.show()

        if show_circles:
            # Display the image with detected circles
            plt.figure(figsize=(10, 10))
            plt.imshow(cv.cvtColor(img_with_circles, cv.COLOR_BGR2RGB))
            plt.title("Image with Detected Circles")
            plt.axis("off")
            plt.show()
    else:
        print("No circles were found.")
        img_inpainted = img.copy()

    return img_inpainted

result = remove_circles(
    image_path="img/geometric-drawing-art.webp",
    show_circles=True,
    circle_size_increase=5
)
No description has been provided for this image
No description has been provided for this image
In [43]:
result = remove_circles(
    image_path="img/bouys.png",
    dp=1,
    minDist=20,
    param1=40,
    param2=17,
    minRadius=1,
    maxRadius=10,
    inpaintRadius=3,
    inpaintMethod=cv.INPAINT_NS,
    circle_size_increase=5
)
No description has been provided for this image

Day 22¶

For Day 22, the objective is to remove circular objects from an image related to my country using image processing techniques. The image features a scene with ocean buoys, which I want to clean by removing these repetitive circular patterns.

The problem to address is the removal of buoys from the image while preserving the surrounding scenery. The use case is applicable in scenarios like image editing, where unwanted objects (in this case, the buoys) need to be removed without affecting other areas of the image.

Through multiple trials, I adjusted parameters such as the Hough Transform's sensitivity for circle detection (param1, param2) and circle radius constraints. The inpainting radius (inpaintRadius) was also refined for a smoother restoration of the image background once the buoys were detected and removed. The results show that some buoys were successfully removed, while a few remain, indicating that the method effectively handles certain circles but requires further tuning to capture smaller or more ambiguous ones.

The objective of this task was to find the best balance of parameters to remove the majority of the buoys without significantly disrupting the background scene.

Day 23¶

Discuss your choice of data, methods, and parameters, as well as the achieved results, in approximately 150 words in relation to your problem statement and use case.

The data used is a photograph of an ocean scene with buoys in Playa del Carmen, which is known for its beautiful beaches. The buoys serve as a good test case for detecting and eliminating circles while preserving the water and sky. The methods involved using the Hough Circle Transform for detection and inpainting to fill in the areas where buoys were removed. Parameters such as dp, param1, param2, and inpaintRadius were adjusted through trial and error to optimize detection without overly affecting other image details.

The results indicate that while some buoys were removed successfully, others remained, demonstrating the limits of the current function and configuration in handling smaller and less distinct circles. Further refinement of detection parameters might help, but the method seems suitable for this use case, where the goal is to remove repetitive circular objects. The visualizations equally show the effectiveness of the method and areas for improvement.

Extra fact:

This beach is also famous for its bull shark diving spots; every year the biggest bull sharks living in the Gulf of Mexico and the Caribbean come here to give birth.

Picture from below the buoys:

awesome Image

2.3. Algorithms for Structure Detection in Images¶

Day 24¶

Choose a suitable example image from your country to demonstrate the individual steps of a well-known classical image processing algorithm for detecting lines. Define a problem statement and a use case for your task.

Ancient Stone Carving from Mexico¶

awesome Image

For this task, I chose an image of an ancient stone carving from Mexico to demonstrate line detection techniques. The image captures intricate stonework, with various patterns and designs that are well defined in some places but weathered over time. The goal of this experiment is to apply a classical edge detection algorithm to identify and highlight the lines in the carving, which can assist in the preservation or digital restoration of historical artifacts.

Problem Statement:
The problem to be addressed is detecting and enhancing the carved lines of ancient stonework for preservation purposes. Over time, erosion has made some lines less visible, and digital restoration/preservation is essential to preserve historical information.

Use Case:
The use case is specifically focused on cultural heritage preservation, where ancient structures and artifacts are susceptible to wear. The line detection algorithm can assist experts in automatically identifying and enhancing faded or eroded patterns, making restoration efforts more efficient.

Objective:
The objective is to apply a well-known classical algorithm to accurately identify the contours and lines in the carving. By applying this algorithm, we aim to demonstrate the steps and importance of line detection in preserving detailed historical artifacts digitally.

The picture was taken by me.

Day 25¶

Describe what is conceptually relevant in each individual step of your detection algorithm for lines.

Conceptual Description of Each Step:

Grayscale Conversion:

  • Line detection algorithms like Canny or Hough Transform work more effectively on single-channel images. By converting the image to grayscale, we remove color information and focus purely on intensity (brightness) variations, which simplifies the process of detecting lines or edges based on changes in light intensity.

Gaussian Blurring:

  • Noise in images can cause unwanted or false edge detection. Gaussian blurring is used to reduce this noise by smoothing the image. This helps to avoid detecting small-scale intensity changes that are not actual lines, improving the reliability of the edge detection step.

Gradient Calculation:

  • To detect edges (and ultimately lines), we need to calculate the intensity gradient, which identifies areas of rapid intensity change. The gradient can be calculated using techniques like the Sobel filter, which computes the first derivatives of the image in both the horizontal (x) and vertical (y) directions. This step detects where the most significant intensity changes occur, marking potential edges.

Non-Maximum Suppression:

  • After calculating the gradient, edges will often be wide and blurry. Non-maximum suppression helps to thin out the edges by retaining only the pixels that represent the maximum gradient magnitude in the direction of the edge. This step ensures that the detected lines are as sharp and thin as possible.

Double Thresholding:

  • This step helps differentiate between strong edges, weak edges, and non-edges by applying two thresholds: high and low. Strong edges are clearly part of the lines, weak edges may or may not be, and non-edges are discarded. This helps distinguish between actual lines and noise or irrelevant details in the image.

Edge Tracking by Hysteresis:

  • After applying the double threshold, weak edges that are connected to strong edges are retained, while other weak edges are discarded. This final step helps ensure that only continuous, meaningful lines are detected and fragmented, weak edges are filtered out.
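The double-threshold and hysteresis steps above can also be sketched compactly with connected-component labeling (a hedged alternative using scipy.ndimage, rather than the loop-based pixel propagation described above; the toy magnitude array and thresholds are illustrative):

```python
import numpy as np
from scipy.ndimage import label

def hysteresis(mag, low, high):
    strong = mag > high
    weak = (mag >= low) & (mag <= high)
    # Label 8-connected regions of candidate edge pixels (strong or weak)
    labels, n = label(strong | weak, structure=np.ones((3, 3)))
    # Keep a region only if it contains at least one strong pixel;
    # the (labels > 0) guard excludes the background when no strong pixel exists
    keep = np.isin(labels, np.unique(labels[strong]))
    return keep & (labels > 0)

mag = np.array([[0, 50, 130, 50, 0],
                [0,  0,   0,  0, 0],
                [0, 60,  50, 60, 0]])
edges = hysteresis(mag, low=40, high=120)
print(edges.astype(int))
```

Only the top row survives: its weak pixels touch the strong 130 pixel, while the bottom row's weak pixels are isolated and discarded.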

Day 26¶

Implement each step of your detection algorithm for lines. The individual steps of the algorithm can be self-programmed or you may use libraries. However, it is important that the relevant intermediate results of the steps can be demonstrated (see next tasks).

In [44]:
image_path="img/geometric-drawing-art.webp"
image = cv.imread(image_path)
plt.imshow(cv.cvtColor(image, cv.COLOR_BGR2RGB))
Out[44]:
<matplotlib.image.AxesImage at 0x73348ff37d90>
No description has been provided for this image
In [45]:
# Step 1: Grayscale conversion
gray_image = cv.cvtColor(image, cv.COLOR_BGR2GRAY)
plt.figure(figsize=(12, 8))
plt.imshow(gray_image, cmap='gray')
plt.title('Grayscale Conversion')
plt.axis('off')
plt.show()
No description has been provided for this image
In [46]:
# Step 2: Gaussian Blurring to reduce noise
blur_kernel_size=5
blur_sigma=0

blurred_image = cv.GaussianBlur(src=gray_image, ksize=(blur_kernel_size,blur_kernel_size), sigmaX=blur_sigma)
plt.figure(figsize=(12, 8))
plt.imshow(blurred_image, cmap='gray')
plt.title(f'Gaussian Blur ({blur_kernel_size}*{blur_kernel_size}, σ={blur_sigma})')
plt.axis('off')
Out[46]:
(np.float64(-0.5), np.float64(1023.5), np.float64(1023.5), np.float64(-0.5))
No description has been provided for this image
In [47]:
# Step 3: Gradient Calculation using Sobel filters
ksize=3
grad_x = cv.Sobel(blurred_image, cv.CV_64F, 1, 0, ksize=ksize)
grad_y = cv.Sobel(blurred_image, cv.CV_64F, 0, 1, ksize=ksize)
gradient_magnitude = (np.abs(grad_x) + np.abs(grad_y)) / 2
gradient_direction = np.arctan2(grad_y, grad_x) * (180 / np.pi)

plt.figure(figsize=(20, 10))

plt.subplot(1, 2, 1)
plt.title('Sobel Gradient X')
plt.imshow(grad_x)
plt.axis('off')

plt.subplot(1, 2, 2)
plt.title('Sobel Gradient Y')
plt.imshow(grad_y)
plt.axis('off')
plt.show()

plt.figure(figsize=(12, 8))
plt.title('Gradient Magnitude')
plt.imshow(gradient_magnitude, cmap='gray')
plt.axis('off')
plt.show()
No description has been provided for this image
No description has been provided for this image
In [48]:
# Step 4: Non-Maximum Suppression (detailed for 8 directions)
nms_image = np.zeros_like(gradient_magnitude)
angle = gradient_direction % 180  # Constrain angles between 0° and 180°

for i in range(1, gray_image.shape[0] - 1):
    for j in range(1, gray_image.shape[1] - 1):
        # 0 degrees (horizontal)
        if (0 <= angle[i, j] < 22.5) or (157.5 <= angle[i, j] <= 180):
            if (gradient_magnitude[i, j] >= gradient_magnitude[i, j + 1]) and (gradient_magnitude[i, j] >= gradient_magnitude[i, j - 1]):
                nms_image[i, j] = gradient_magnitude[i, j]
        
        # 45 degrees (diagonal)
        if (22.5 <= angle[i, j] < 67.5):
            if (gradient_magnitude[i, j] >= gradient_magnitude[i + 1, j + 1]) and (gradient_magnitude[i, j] >= gradient_magnitude[i - 1, j - 1]):
                nms_image[i, j] = gradient_magnitude[i, j]

        # 90 degrees (vertical)
        if (67.5 <= angle[i, j] < 112.5):
            if (gradient_magnitude[i, j] >= gradient_magnitude[i + 1, j]) and (gradient_magnitude[i, j] >= gradient_magnitude[i - 1, j]):
                nms_image[i, j] = gradient_magnitude[i, j]

        # 135 degrees (diagonal)
        if (112.5 <= angle[i, j] < 157.5):
            if (gradient_magnitude[i, j] >= gradient_magnitude[i + 1, j - 1]) and (gradient_magnitude[i, j] >= gradient_magnitude[i - 1, j + 1]):
                nms_image[i, j] = gradient_magnitude[i, j]

# Display the result
plt.figure(figsize=(12, 8))
plt.imshow(nms_image, cmap='gray')
plt.title('Non-Maximum Suppression (8 Directions)')
plt.axis('off')
plt.show()
No description has been provided for this image
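The four-direction binning can also be written compactly with `np.digitize`; standard Canny implementations place the bin edges at 22.5°, 67.5°, 112.5° and 157.5° (67.5 is easy to mistype as 65.5). A small sketch with sample angles:

```python
import numpy as np

# Sample gradient directions, already folded into [0, 180) via % 180.
angles = np.array([5.0, 30.0, 90.0, 130.0, 170.0])
edges = np.array([22.5, 67.5, 112.5, 157.5])

# Bin index: 0 = horizontal (0 deg), 1 = 45 deg, 2 = vertical (90 deg),
# 3 = 135 deg; angles above 157.5 wrap back to the horizontal bin.
direction = np.digitize(angles, edges) % 4
```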
In [49]:
# Step 5: Double Thresholding
low_threshold=40
high_threshold=120

strong_edges = nms_image > high_threshold
weak_edges = (nms_image >= low_threshold) & (nms_image <= high_threshold)
double_threshold_image = np.zeros_like(nms_image, dtype=np.uint8)
double_threshold_image[strong_edges] = 255
double_threshold_image[weak_edges] = 75  # Mark weak edges with a different intensity for visualization

plt.figure(figsize=(12, 8))
plt.imshow(double_threshold_image, cmap='gray')
plt.title('Double Thresholding')
plt.axis('off')
plt.show()
No description has been provided for this image
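A toy NMS response (hypothetical values) makes the partition explicit: pixels strictly above `high_threshold` become strong, pixels in [low, high] become weak, and the rest are dropped.

```python
import numpy as np

nms_demo = np.array([[  0.,  30.,  50.],
                     [ 90., 130., 200.],
                     [ 10.,  41., 119.]])
low, high = 40, 120

strong = nms_demo > high                       # 130 and 200
weak = (nms_demo >= low) & (nms_demo <= high)  # 50, 90, 41, 119
```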
In [50]:
# Step 6: Edge Tracking by Hysteresis

def hysteresis_tracking(double_threshold_image):
    # Create a copy of the input image to work on
    final_edges = np.copy(double_threshold_image)
    rows, cols = double_threshold_image.shape
    
    # Define a stack-based approach to traverse all connected weak edges
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # If a pixel is a weak edge and is adjacent to a strong edge, propagate the edge
            if final_edges[i, j] == 75:  # Weak edge
                if 255 in final_edges[i-1:i+2, j-1:j+2]:  # Check if any neighboring pixel is a strong edge
                    stack = [(i, j)]
                    while stack:
                        x, y = stack.pop()
                        if final_edges[x, y] == 75:  # Mark connected weak edges as strong
                            final_edges[x, y] = 255
                            # Add all 8 neighboring pixels if they are weak edges
                            for nx, ny in [(x-1, y-1), (x-1, y), (x-1, y+1),
                                           (x, y-1), (x, y+1),
                                           (x+1, y-1), (x+1, y), (x+1, y+1)]:
                                if 0 <= nx < rows and 0 <= ny < cols and final_edges[nx, ny] == 75:  # bounds check avoids negative-index wraparound
                                    stack.append((nx, ny))
                else:
                    final_edges[i, j] = 0  # Discard isolated weak edges
    
    return final_edges

# Apply the edge tracking by hysteresis
final_edges = hysteresis_tracking(double_threshold_image)

# Display the result
plt.figure(figsize=(18, 15))
plt.subplot(1, 2, 1)
plt.imshow(final_edges, cmap='gray')
plt.title('Edge Tracking by Hysteresis (final detected lines)')
plt.axis('off')

plt.subplot(1, 2, 2)
plt.imshow(image, cmap='gray')
plt.title('Original Image')
plt.axis('off')
plt.show()
No description has been provided for this image
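The double loop plus stack works, but pure-Python pixel scanning is slow on megapixel images. An equivalent formulation treats hysteresis as repeated one-pixel dilation of the strong mask, restricted to weak pixels, until nothing changes. The sketch below is pure NumPy; `hysteresis_np` is my own name, and it assumes the same 75/255 encoding used above.

```python
import numpy as np

def hysteresis_np(dt_img, weak=75, strong=255):
    """Keep weak pixels 8-connected to a strong pixel, drop the rest."""
    strong_m = dt_img == strong
    weak_m = dt_img == weak
    cur = strong_m.copy()
    while True:
        # Dilate the current edge mask by one pixel (8-connectivity)
        # using padded shifts, then re-admit only weak pixels.
        p = np.pad(cur, 1)
        dil = np.zeros_like(cur)
        h, w = cur.shape
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                dil |= p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        nxt = strong_m | (weak_m & dil)
        if np.array_equal(nxt, cur):
            break
        cur = nxt
    return np.where(cur, strong, 0).astype(np.uint8)

# Tiny synthetic check: weak pixels chained to a strong seed survive,
# an isolated weak pixel does not.
demo = np.zeros((5, 7), np.uint8)
demo[2, 1] = 255      # strong seed
demo[2, 2:4] = 75     # weak pixels connected to it
demo[2, 6] = 75       # isolated weak pixel
out = hysteresis_np(demo)
```

Each dilation pass is vectorized, so the cost scales with the longest weak chain rather than with a Python loop over every pixel.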

Day 27¶

Demonstrate each individual step of your lines detection algorithm with your selected image. For which elements does your algorithm work well or poorly, in relation to your problem statement and use case?

In [51]:
inkart = cv.imread("img/inkart.jpg")
inkart_center = inkart[1000:1450, 750:1350]
image=inkart_center
In [52]:
# Step 1: Grayscale conversion
gray_image = cv.cvtColor(image, cv.COLOR_BGR2GRAY)
plt.figure(figsize=(12, 8))
plt.imshow(gray_image, cmap='gray')
plt.title('Grayscale Conversion')
plt.axis('off')
plt.show()
No description has been provided for this image
In [53]:
# Step 2: Gaussian Blurring to reduce noise
blur_kernel_size=5
blur_sigma=1.2

blurred_image = cv.GaussianBlur(src=gray_image, ksize=(blur_kernel_size,blur_kernel_size), sigmaX=blur_sigma)
plt.figure(figsize=(12, 8))
plt.imshow(blurred_image, cmap='gray')
plt.title(f'Gaussian Blur ({blur_kernel_size}*{blur_kernel_size}, σ={blur_sigma})')
plt.axis('off')
Out[53]:
(np.float64(-0.5), np.float64(599.5), np.float64(449.5), np.float64(-0.5))
No description has been provided for this image
In [54]:
# Step 3: Gradient Calculation using Sobel filters
ksize=3
grad_x = cv.Sobel(blurred_image, cv.CV_64F, 1, 0, ksize=ksize)
grad_y = cv.Sobel(blurred_image, cv.CV_64F, 0, 1, ksize=ksize)
gradient_magnitude = (np.abs(grad_x) + np.abs(grad_y)) / 2  # halved L1 approximation of the Euclidean magnitude
gradient_direction = np.arctan2(grad_y, grad_x) * (180 / np.pi)

plt.figure(figsize=(20, 10))

plt.subplot(1, 2, 1)
plt.title('Sobel Gradient X')
plt.imshow(grad_x)
plt.axis('off')

plt.subplot(1, 2, 2)
plt.title('Sobel Gradient Y')
plt.imshow(grad_y)
plt.axis('off')
plt.show()

plt.figure(figsize=(12, 8))
plt.title('Gradient Magnitude')
plt.imshow(gradient_magnitude, cmap='gray')
plt.axis('off')
plt.show()
No description has been provided for this image
No description has been provided for this image
In [55]:
# Step 4: Non-Maximum Suppression (detailed for 8 directions)
nms_image = np.zeros_like(gradient_magnitude)
angle = gradient_direction % 180  # Constrain angles between 0° and 180°

for i in range(1, gray_image.shape[0] - 1):
    for j in range(1, gray_image.shape[1] - 1):
        # 0 degrees (horizontal)
        if (0 <= angle[i, j] < 22.5) or (157.5 <= angle[i, j] <= 180):
            if (gradient_magnitude[i, j] >= gradient_magnitude[i, j + 1]) and (gradient_magnitude[i, j] >= gradient_magnitude[i, j - 1]):
                nms_image[i, j] = gradient_magnitude[i, j]
        
        # 45 degrees (diagonal)
        if (22.5 <= angle[i, j] < 67.5):
            if (gradient_magnitude[i, j] >= gradient_magnitude[i + 1, j + 1]) and (gradient_magnitude[i, j] >= gradient_magnitude[i - 1, j - 1]):
                nms_image[i, j] = gradient_magnitude[i, j]

        # 90 degrees (vertical)
        if (67.5 <= angle[i, j] < 112.5):
            if (gradient_magnitude[i, j] >= gradient_magnitude[i + 1, j]) and (gradient_magnitude[i, j] >= gradient_magnitude[i - 1, j]):
                nms_image[i, j] = gradient_magnitude[i, j]

        # 135 degrees
        if (112.5 <= angle[i, j] < 157.5):
            if (gradient_magnitude[i, j] >= gradient_magnitude[i + 1, j - 1]) and (gradient_magnitude[i, j] >= gradient_magnitude[i - 1, j + 1]):
                nms_image[i, j] = gradient_magnitude[i, j]

# Display the result
plt.figure(figsize=(18, 9))
plt.subplot(1, 2, 1)
plt.imshow(nms_image, cmap='gray')
plt.title('Non-Maximum Suppression (8 Directions)')
plt.axis('off')

plt.subplot(1, 2, 2)
plt.imshow(gradient_magnitude, cmap='gray')
plt.title('Gradient Magnitude (previous step)')
plt.axis('off')
plt.show()
No description has been provided for this image
In [56]:
# Step 5: Double Thresholding
low_threshold=40
high_threshold=100

strong_edges = nms_image > high_threshold
weak_edges = (nms_image >= low_threshold) & (nms_image <= high_threshold)
double_threshold_image = np.zeros_like(nms_image, dtype=np.uint8)
double_threshold_image[strong_edges] = 255
double_threshold_image[weak_edges] = 75  # Mark weak edges with a different intensity for visualization

plt.figure(figsize=(18, 11))
plt.subplot(1, 2, 1)
plt.imshow(double_threshold_image, cmap='gray')
plt.title('Double Thresholding')
plt.axis('off')

plt.subplot(1, 2, 2)
plt.imshow(nms_image, cmap='gray')
plt.title('Non-Maximum Suppression (previous step)')
plt.axis('off')

plt.show()
No description has been provided for this image
In [57]:
# Step 6: Edge Tracking by Hysteresis

def hysteresis_tracking(double_threshold_image):
    # Create a copy of the input image to work on
    final_edges = np.copy(double_threshold_image)
    rows, cols = double_threshold_image.shape
    
    # Define a stack-based approach to traverse all connected weak edges
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # If a pixel is a weak edge and is adjacent to a strong edge, propagate the edge
            if final_edges[i, j] == 75:  # Weak edge
                if 255 in final_edges[i-1:i+2, j-1:j+2]:  # Check if any neighboring pixel is a strong edge
                    stack = [(i, j)]
                    while stack:
                        x, y = stack.pop()
                        if final_edges[x, y] == 75:  # Mark connected weak edges as strong
                            final_edges[x, y] = 255
                            # Add all 8 neighboring pixels if they are weak edges
                            for nx, ny in [(x-1, y-1), (x-1, y), (x-1, y+1),
                                           (x, y-1), (x, y+1),
                                           (x+1, y-1), (x+1, y), (x+1, y+1)]:
                                if 0 <= nx < rows and 0 <= ny < cols and final_edges[nx, ny] == 75:  # bounds check avoids negative-index wraparound
                                    stack.append((nx, ny))
                else:
                    final_edges[i, j] = 0  # Discard isolated weak edges
    
    return final_edges

# Apply the edge tracking by hysteresis
final_edges = hysteresis_tracking(double_threshold_image)

# Display the result
plt.figure(figsize=(18, 15))
plt.subplot(1, 2, 1)
plt.imshow(final_edges, cmap='gray')
plt.title('Edge Tracking by Hysteresis (final detected lines)')
plt.axis('off')

plt.subplot(1, 2, 2)
plt.imshow(double_threshold_image, cmap='gray')
plt.title('Double Thresholding (previous step)')
plt.axis('off')
plt.show()
No description has been provided for this image
In [58]:
def overlay_edges_on_image(original_image, edges_image):
    # Ensure edges are in the correct format for overlaying
    edges_colored = cv.cvtColor(edges_image, cv.COLOR_GRAY2BGR)
    edges_colored[edges_image == 255] = [0, 255, 0]  # Make edges green

    # Overlay edges on the original image
    overlaid_image = cv.addWeighted(original_image, 0.8, edges_colored, 0.5, 0)
    
    return overlaid_image

# Overlay the detected edges on the original image
overlaid_image = overlay_edges_on_image(image, final_edges)

# Display the overlaid image
plt.figure(figsize=(12, 8))
plt.imshow(overlaid_image)
plt.title('Detected Edges Overlayed on Original Image')
plt.axis('off')
plt.show()
No description has been provided for this image
In [59]:
# Step 7: Detect straight lines using the probabilistic Hough Transform

lines = cv.HoughLinesP(image=final_edges,
                       rho=1,
                       theta=(np.pi / 180),
                       threshold=110, 
                       minLineLength=35,
                       maxLineGap=5
                       )

line_image = np.zeros_like(image)  # Black image to draw lines on

if lines is not None:
    for line in lines:
        for x1, y1, x2, y2 in line:
            cv.line(line_image, (x1, y1), (x2, y2), (0, 255, 0), 2)

# Superimpose the lines on the original image
result = cv.addWeighted(image, 0.8, line_image, 1, 0)

plt.figure(figsize=(12,8))
plt.imshow(result)
plt.title('Detected Lines')
plt.axis('off')
Out[59]:
(np.float64(-0.5), np.float64(599.5), np.float64(449.5), np.float64(-0.5))
No description has been provided for this image
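Under the hood, `HoughLinesP` lets every edge pixel vote into a (rho, theta) accumulator at the resolutions given by `rho=1` and `theta=np.pi/180`; peaks in the accumulator are candidate lines, which the probabilistic variant then checks against `minLineLength` and `maxLineGap`. A minimal, self-contained sketch of the voting step on a synthetic vertical line (the edge coordinates are made up for illustration):

```python
import numpy as np

# Edge pixels of a perfect vertical line x = 5 on a small toy grid.
points = [(5, y) for y in range(20)]

thetas = np.deg2rad(np.arange(180))  # 1-degree steps, like np.pi / 180
rho_max = 30                         # covers the toy grid's diagonal
acc = np.zeros((2 * rho_max, len(thetas)), dtype=int)

# Each edge pixel votes for every (rho, theta) line passing through it.
for x, y in points:
    for t_idx, theta in enumerate(thetas):
        rho = int(round(x * np.cos(theta) + y * np.sin(theta)))
        acc[rho + rho_max, t_idx] += 1

# All 20 pixels agree on (rho = 5, theta = 0), so that cell saturates.
votes_for_line = acc[5 + rho_max, 0]
```

Raising `threshold` in the real call simply demands taller accumulator peaks, which is why it trades missed lines against false detections.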

Day 28¶

Discuss your choice of data, methods, and parameters, as well as the achieved results, in approximately 150 words in relation to your problem statement and use case. Why did you choose this image?

I chose this image as a challenge because of its wide variety of edge strengths and orientations, which provided a good opportunity to explore how different parameter settings influence the algorithm's performance in each step. It turned out to be more challenging than I initially expected, but overall, I am mostly satisfied with the results.

The algorithm still struggles with detecting certain prominent straight lines that are visually obvious, such as the two diagonal ones along the edges of the arms. The edge detection results look acceptable with the current parameters, though I would have liked to capture more lines. Attempts to include these additional lines often led to an excessive number of false detections, likely due to the surface corrosion and weathering of the stone, given its age. Additionally, the algorithm seems to detect horizontal lines more reliably than vertical ones, which could be related to the angle of sunlight when I took the picture.

In [60]:
time_now = time.time()
passed = time_now - start_time
print(f"Time taken: {passed:.2f} seconds")
Time taken: 296.31 seconds